Having a vehicle listen to a person who is outside it is a deceptively hard acoustics problem, and the difficulty is not the microphone — it is everything around the microphone. A person standing a few feet from a parked car, trying to be heard over an idling engine, a busy street, or wind, is a weak signal buried in a loud, diffuse field of noise. A single microphone has no way to separate the two: it sums the speaker and the world into one channel. Rolling a window all the way down to let the sound in only makes it worse, because an open window is also an open door for road and ambient noise. A newly published Hyundai Motor Company application, surfaced in this week's patent pub drop, is directed at exactly that problem. The hero of the cluster is US20260181346A1, titled "Method and Apparatus for Receiving a Voice From a Speaker Outside a Vehicle," a pending application dated June 25, 2026.

The approach the application discloses is to treat the problem as three coupled subproblems (locate, then ventilate, then aim) and to let the camera drive the audio. The method begins when an "external call mode," a mode dedicated to receiving voice from a speaker outside the vehicle, is activated; the disclosure notes this can be triggered by detection information about the speaker received from the camera itself. The system then determines position information of the speaker from a captured image. That position is not a single bearing: the application specifies it includes a relative position indicating whether the speaker is toward the front or the rear of the vehicle, and the speaker's height. With that geometry in hand, the vehicle opens a window, but only as far as a noise calculation permits: it moves the window up and down, compares the amount of noise measured at the various heights, and settles at the height where the noise is lowest. Finally, the system controls beamforming of the voice received by an array of microphones, using the speaker's position to steer the array.

A method for receiving voice, the method comprising: determining position information of a speaker outside a vehicle based on an image captured by a camera based on an external call mode for receiving voice from the speaker outside the vehicle being activated; opening a window of the vehicle based on an amount of noise; and controlling beamforming of voice received by an array of microphones based on the position information of the speaker.
Source: Method and Apparatus for Receiving a Voice From a Speaker Outside a Vehicle, US20260181346A1
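
The window step can be pictured as a small search: try several opening heights, measure the noise at each, and keep the quietest. The Python sketch below is illustrative only and is not taken from the application. The function name, the millimetre units, the `measure_noise_db` callback, and the noise readings are all assumptions made up for the example.

```python
from typing import Callable, Iterable


def pick_window_height(
    candidate_heights_mm: Iterable[float],
    measure_noise_db: Callable[[float], float],
) -> float:
    """Return the candidate opening height with the lowest measured noise."""
    best_height = None
    best_noise = float("inf")
    for height in candidate_heights_mm:
        # Hypothetical callback: move the window to `height`, then sample noise.
        noise = measure_noise_db(height)
        if noise < best_noise:
            best_height, best_noise = height, noise
    if best_height is None:
        raise ValueError("no candidate heights supplied")
    return best_height


# Made-up readings (opening height in mm -> noise level in dB), for illustration.
fake_readings = {50.0: 62.0, 100.0: 58.5, 150.0: 60.0, 200.0: 66.0}
chosen = pick_window_height(fake_readings, lambda h: fake_readings[h])
print(chosen)
```

A real controller would also have to decide how many heights to try and how long to listen at each one; the disclosure as summarized here does not fix those details.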

Why use a camera to aim a microphone?

The interesting engineering choice here is the cross-modal one: a vision system is used to point an acoustic system. Beamforming with a microphone array works by exploiting the tiny differences in arrival time of a sound at each microphone. If you deliberately delay each microphone's signal by the right amount (equivalently, for a given frequency, shift its phase) before summing them, the signals from one chosen direction add up in phase and reinforce, while sound from every other direction arrives misaligned and partially cancels. The array becomes directionally selective: an electronically steerable ear with no moving parts. The catch is that delay-and-sum beamforming only helps if you know which direction to steer toward. The disclosed method answers that with the camera: the speaker's image-derived position, including front-or-rear and height, supplies the steering target, and the application specifies that the phase of the voice received by the array is delayed by a predetermined angle based on that position information. A dependent claim ties the phase delay specifically to a comparison between the speaker's height and the height at which the window is opened, so the vertical geometry of where the sound has to travel through the window gap is folded directly into the beamforming math.
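
To make the delay-and-sum idea concrete, here is a minimal Python sketch under stated assumptions: a straight-line array, a far-field source, whole-sample time delays, and a synthetic 1 kHz tone. None of these specifics come from the application, which speaks of delaying phase by a predetermined angle; the function names, array spacing, and sample rate are hypothetical choices for illustration.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air near 20 °C


def steering_delays(mic_x_m, angle_deg, sample_rate_hz):
    """Per-microphone delays (whole samples) that align a far-field source.

    angle_deg is measured from broadside (perpendicular to the array line),
    positive toward +x. Microphones nearer the source hear it first, so they
    get the larger compensating delay.
    """
    s = math.sin(math.radians(angle_deg))
    raw = [x * s / SPEED_OF_SOUND_M_S * sample_rate_hz for x in mic_x_m]
    base = min(raw)
    return [round(r - base) for r in raw]


def delay_and_sum(channels, delays):
    """Delay each channel by its sample count, then average the channels."""
    n = len(channels[0])
    out = [0.0] * n
    for channel, d in zip(channels, delays):
        for t in range(d, n):
            out[t] += channel[t - d]
    return [v / len(channels) for v in out]


def simulate_arrival(source, true_delays):
    """Fake recording: each mic lags by exactly what true_delays would undo."""
    latest = max(true_delays)
    n = len(source)
    return [[0.0] * (latest - d) + source[: n - (latest - d)] for d in true_delays]


def mean_power(signal, skip=32):
    """Average squared amplitude, ignoring the start-up transient."""
    tail = signal[skip:]
    return sum(v * v for v in tail) / len(tail)


fs = 16000                    # assumed sample rate in Hz
mics = [0.0, 0.1, 0.2, 0.3]   # assumed mic positions along the array, in metres
tone = [math.sin(2 * math.pi * 1000 * t / fs) for t in range(1600)]

# The "speaker" sits 30 degrees off broadside.
recorded = simulate_arrival(tone, steering_delays(mics, 30.0, fs))
on_target = mean_power(delay_and_sum(recorded, steering_delays(mics, 30.0, fs)))
off_target = mean_power(delay_and_sum(recorded, steering_delays(mics, -30.0, fs)))
print(on_target, off_target)
```

Steering at the true direction keeps the tone at nearly full power, while steering at the mirror-image direction leaves the channels misaligned and the summed output much weaker. In the disclosed system the steering angle would come from the camera-derived position rather than being typed in by hand.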

That coupling between the window and the beam is the part of the disclosure that rewards a second read. The window is not simply opened; it is positioned to minimize noise, and its opening height then becomes an input to the acoustic steering. The application describes the case where the window is opened to a height higher than the speaker — in that configuration the phase delay is computed from both the window-opening height and the speaker's position, and the method can even reset the window height based on comparing the speaker's height against the opening height. The window aperture, in other words, is treated as part of the acoustic path the beamformer has to reason about, not just a hole that lets sound in. Independent apparatus claims restate the method as a camera plus a "head unit" — the in-dash compute module — that performs the locate, ventilate, and aim steps, with one variant generalizing from an array to "one or more microphones."
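
One way to picture how the two heights could feed a steering angle is plain trigonometry: the height difference between the speaker and the window opening, over the horizontal distance between them, gives an elevation angle. The Python sketch below is a hypothetical illustration, not the application's formula. The function name, the assumption that both heights are measured from the ground, and the horizontal-distance parameter are all invented for the example.

```python
import math


def elevation_angle_deg(
    speaker_height_m: float,
    window_opening_height_m: float,
    horizontal_distance_m: float,
) -> float:
    """Angle from the window opening up (+) or down (-) to the speaker.

    Assumes both heights are measured from the ground and that the
    horizontal distance to the speaker is known and positive.
    """
    if horizontal_distance_m <= 0:
        raise ValueError("horizontal distance must be positive")
    rise = speaker_height_m - window_opening_height_m
    return math.degrees(math.atan2(rise, horizontal_distance_m))


# Hypothetical numbers: a speaker above and then below a 1.2 m opening, 1 m away.
angle_up = elevation_angle_deg(1.6, 1.2, 1.0)
angle_down = elevation_angle_deg(1.0, 1.2, 1.0)
print(angle_up, angle_down)
```

The sign of the result captures the comparison the text describes: a negative angle means the window is open higher than the speaker, which is the case the application singles out.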

Where this sits in the state of the art is worth stating plainly. Microphone-array beamforming and camera-assisted source localization are each mature techniques on their own; in-cabin voice systems already use arrays to favor the driver's seat over a passenger. What this application discloses is a specific construction aimed outward: using the vehicle's own exterior-facing camera to locate an outside speaker and steering the array through a window opened only to the height where the measured noise is lowest. The filing is classified under H04S 7/303, the spatial-audio control class, which places the invention's center of gravity in audio signal processing rather than in the driver-assistance control classes where much of Hyundai's autonomy work lives. One caveat is load-bearing: this is a pending application, not a granted patent. It describes what Hyundai's engineers disclosed and sought to protect, and the claims that eventually issue may be narrower than the abstract and the independent claims read today.

One filing in a broad week for Hyundai's R&D

Read alone, the hero filing is a self-contained human-machine-interface invention: a way for a car to hold a conversation with someone standing outside it — a drive-through, a parking attendant, a security gate — without the driver leaning out the window or shouting over engine noise. Read alongside its companions in the same June 25 pub drop, it is a reminder of how wide Hyundai's published-application footprint runs in a single week, spanning domains that have little to do with one another beyond the same assignee. The cluster reaches well outside the cabin. US20260180331A1 ("Apparatus and Method for Controlling Home Energy Management System") describes a controller that generates a control strategy for home power devices and distributed energy resources connected to a home energy management system, choosing between a rule-based algorithm and an optimization algorithm based on their performance and on user settings — an energy-management invention squarely in the vehicle-to-home and residential-power adjacency that automakers increasingly file in.

Two of the companions sit deep in video signal processing, a field that shares the array-and-filter mathematical heritage of the hero's beamforming but applies it to pixels rather than sound. US20260181160A1 ("Method and Apparatus for Video Coding Using an Improved In-Loop Filter") describes generating a residual frame from a reconstructed frame with a deep-learning model and applying it through a linear model to improve the codec's in-loop filter, while US20260181137A1 ("Method and Apparatus for Video Coding That Adaptively Determines Blending Area in Geometric Partitioning Mode") describes partitioning a block along a geometric split boundary and adaptively determining a blending region and blending matrix for weight-summing the two subregions' predicted signals. Both are compression-efficiency work of the kind that underpins in-vehicle cameras and streamed cabin displays.

Rounding out the drop are two hardware filings spanning powertrain and energy storage. US20260179979A1 ("Fuel Cell Apparatus for Ships") describes a marine fuel-cell stack with a hydrogen line and a cabinet partitioned into multiple spaces, ventilated by a device on its upper end to clear the compartment holding the stack and hydrogen line: a safety-driven packaging invention for hydrogen at sea. US20260180068A1 ("Battery Pack") describes a pack whose base plate and cover plate each carry a cooling-water channel, fed by inlet and outlet pipes routed between the two channels so coolant flows above and below the cells, a double-sided liquid-cooling architecture pulling heat from both faces. Taken as a set, the cluster is less a single coherent story than a snapshot of an R&D organization filing simultaneously across infotainment acoustics, residential energy, video codecs, marine hydrogen, and battery thermal design, with the outside-voice method the one that most directly reimagines how a person and a vehicle talk to each other.