Building semantic 3D maps can be valuable for searching offices, warehouses, stores, and homes for objects of interest. We present a multi-camera mapping system that incrementally builds a Language-Embedded Gaussian Splat (LEGS), a detailed 3D scene representation that encodes appearance and semantics in a unified form. LEGS is trained online as the robot traverses its environment, enabling localization of open-vocabulary object queries. We evaluate LEGS on three room-scale scenes, querying randomly selected objects in each scene to assess its ability to capture semantic meaning. Comparing our system to LERF on these three scenes, we find that while both achieve comparable object-query success rates, LEGS trains over 3.5x faster than LERF. Qualitative results suggest that the multi-camera setup and incremental bundle adjustment improve visual reconstruction quality under constrained robot trajectories, and experimental results suggest that LEGS can localize objects with up to 66% accuracy across three large indoor environments while producing high-fidelity Gaussian Splats online by integrating bundle adjustment updates.