Testing Browser Audio with Playwright

How to test browser audio in Playwright by intercepting HTML5 and Web Audio API calls—ensuring reliable sound tests.

Testing visual elements is straightforward with Playwright, but how do you verify that a specific sound effect played when a user clicked a button or an in-game event occurred? Since automated browsers in CI pipelines don't have "ears," we have to get creative.

The practical answer is monkey-patching: we override the browser's Web Audio API and HTMLAudioElement methods to intercept audio playback, then verify the intercepted calls in our test assertions.

The most reliable way to track audio in a browser is to override the prototype methods responsible for playing sound. This includes:

HTMLAudioElement.prototype.play(): For sounds triggered by <audio> tags or new Audio().play().

AudioBufferSourceNode.prototype.start(): For sounds played using the Web Audio API, common in games and complex interactive experiences.

By overriding these methods, we can record every attempt to play a sound file into a global array on the window object, which Playwright can then inspect.
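
Concretely, each intercepted call is pushed as a small record. A sketch of the shape those records take, matching the fields the monitor below actually pushes:

// Shape of the records collected on window.soundsPlayed
interface SoundEvent {
  name: string // derived from src, or from an app-provided audioName
  src?: string // only present for HTMLAudio events
  type: 'HTMLAudio' | 'WebAudio'
  timestamp: number // Date.now() at the moment play()/start() was called
}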

  1. Creating the Audio Helper (as a fixture)

First, let's build a clean AudioMonitor class and wrap it in a Playwright fixture. This ensures that every test gets its own isolated instance of the monitor, preventing "state leaking" when running tests in parallel.
// fixtures/audio-fixture.ts
import { test as base, Page } from '@playwright/test'

export class AudioMonitor {
  constructor(private page: Page) {}

  async start() {
    await this.page.addInitScript(() => {
      ;(window as any).soundsPlayed = []

      // --- 1. Patch Standard HTML5 Audio ---
      const originalHtmlAudioPlay = HTMLAudioElement.prototype.play
      HTMLAudioElement.prototype.play = function (...args) {
        const soundName =
          this.src.split('/').pop()?.split('.')[0] || 'unknown_html_audio'

        ;(window as any).soundsPlayed.push({
          name: soundName,
          src: this.src,
          type: 'HTMLAudio',
          timestamp: Date.now(),
        })

        return originalHtmlAudioPlay.apply(this, args)
      }

      // --- 2. Patch Web Audio API ---
      const originalBufferSourceStart = AudioBufferSourceNode.prototype.start
      AudioBufferSourceNode.prototype.start = function (...args) {
        // audioName is an app-provided tag (see the note after this class);
        // fall back to a buffer-length label when it isn't set
        const soundName =
          (this as any).audioName ||
          (this.buffer
            ? `buffer_len_${this.buffer.length}`
            : 'unknown_web_audio')

        ;(window as any).soundsPlayed.push({
          name: soundName,
          type: 'WebAudio',
          timestamp: Date.now(),
        })
        console.log(`[AudioMonitor] WebAudio start: ${soundName}`)

        return originalBufferSourceStart.apply(this, args)
      }
    })
  }

  async getPlayedSounds(): Promise<string[]> {
    return await this.page.evaluate(() => {
      return (window as any).soundsPlayed?.map((s: any) => s.name) || []
    })
  }

  async verifyExpectedSounds(expectedSounds: string[]) {
    const playedSounds = await this.getPlayedSounds()
    const missing = expectedSounds.filter(
      expected => !playedSounds.some(played => played.includes(expected))
    )
    return { played: playedSounds, missing }
  }
}
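
One detail worth flagging: the Web Audio patch reads a custom audioName property off the source node. That property is not part of the Web Audio API; it is a naming convention the application has to opt into. A hypothetical sketch of what that tagging looks like on the application side:

// Application-side code (hypothetical): tag each source node so the
// monitor can report a readable name instead of a buffer-length label.
const ctx = new AudioContext()

function playSound(buffer: AudioBuffer, name: string) {
  const source = ctx.createBufferSource()
  source.buffer = buffer
  ;(source as any).audioName = name // picked up by the patched start()
  source.connect(ctx.destination)
  source.start()
}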

In my case, I extend Playwright's base test object with a fixture that sets up the audioMonitor automatically.

// Extend Playwright's base test to include our audioMonitor fixture
export const test = base.extend<{ audioMonitor: AudioMonitor }>({
  audioMonitor: async ({ page }, use) => {
    const monitor = new AudioMonitor(page)
    await monitor.start()
    await use(monitor)
    // Cleanup is automatic as the window object is destroyed with the page
  },
})

export { expect } from '@playwright/test'

  2. Now use it in your tests:
// tests/audio-feature.spec.ts
import { test, expect } from '../fixtures/audio-fixture'

const SOUNDS = ['sound_1', 'sound_2', 'sound_3', 'sound_4']

test('should play expected sounds when audio is ON', async ({
  page,
  audioMonitor,
}) => {
  await page.goto('/somepage')

  // Trigger action that should play the expected sounds
  await page.getByTestId('somebuttonid').click()

  const { missing } = await audioMonitor.verifyExpectedSounds(SOUNDS)

  expect(missing.length, `Missing sounds: ${missing.join(', ')}`).toBe(0)
})
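
For completeness, a hypothetical complementary test for the OFF state, assuming the page exposes a mute toggle with test id 'mute-toggle':

test('should play no sounds when audio is OFF', async ({
  page,
  audioMonitor,
}) => {
  await page.goto('/somepage')

  // Mute first, then trigger the same action
  await page.getByTestId('mute-toggle').click()
  await page.getByTestId('somebuttonid').click()

  const played = await audioMonitor.getPlayedSounds()
  expect(played, `Unexpected sounds: ${played.join(', ')}`).toHaveLength(0)
})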

Pros of this approach:

  • Fixture-Based Isolation: Unlike a singleton, the fixture gives each test its own monitor instance, which is critical for parallel execution.

  • Headless Friendly: This solution doesn't rely on system audio output, making it perfect for CI/CD environments (Docker, GitHub Actions) where no sound card exists.

  • Zero Dependencies: It leverages native browser APIs and Playwright’s built-in addInitScript and test.extend capabilities.

  • Timing & Sequence: Because we capture a timestamp for every sound, you can even assert that "Sound A" played before "Sound B" (see the sketch after this list).

  • Traceability: Because these interceptions happen in the browser context, the console.log calls in the patched methods surface in your Playwright console logs, making the Playwright HTML Report much more descriptive.
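
A minimal sketch of such a sequence assertion, assuming the fixture above (the page URL, test id, and the 'coin'/'jump' sound names are hypothetical):

// tests/audio-sequence.spec.ts
import { test, expect } from '../fixtures/audio-fixture'

test('plays the coin sound before the jump sound', async ({
  page,
  audioMonitor, // requested so the init script gets installed
}) => {
  await page.goto('/game')
  await page.getByTestId('start-button').click()

  // Read the full event records (with timestamps) straight from the page
  const events = await page.evaluate(
    () => (window as any).soundsPlayed as { name: string; timestamp: number }[]
  )

  const coin = events.find(e => e.name.includes('coin'))
  const jump = events.find(e => e.name.includes('jump'))

  expect(coin, 'coin sound never played').toBeDefined()
  expect(jump, 'jump sound never played').toBeDefined()
  expect(coin!.timestamp).toBeLessThanOrEqual(jump!.timestamp)
})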

Cons of this approach:

While powerful, this method captures the intent to play audio. It doesn't guarantee that the user actually heard it (e.g., if the OS volume is zero or a physical speaker is unplugged). However, for functional E2E verification, ensuring the application code correctly calls the playback APIs is the gold standard for automated testing.


