PySimpleGUI and NVDA Voicemeeter

As mentioned in the previous post, Voicemeeter Accessibility for the Blind, I chose to work with PySimpleGUI when developing the NVDA Voicemeeter application.

The main reasons being that layouts are plain Python lists which closely resemble the GUI they describe, and that events arrive through a simple event loop rather than callbacks.

To give a quick example of what I mean, I'll borrow this snippet from the docs.

import PySimpleGUI as sg

layout = [
    [sg.Text("What's your name?")],
    [sg.Input(key="-INPUT-")],
    [sg.Text(size=(40, 1), key="-OUTPUT-")],
    [sg.Button("Ok"), sg.Button("Quit")],
]

window = sg.Window("Window Title", layout)

while True:
    event, values = window.read()
    if event == sg.WINDOW_CLOSED or event == "Quit":
        break

    window["-OUTPUT-"].update(f"Hello {values['-INPUT-']}!")

window.close()

Which produces the following window:

PySimpleGUI example

As you can see, the code closely resembles the GUI it represents. Where you would typically place each widget onto a row or column, with PySimpleGUI you instead place them into lists, or lists of lists, where each inner list describes one row of the window.

In the NVDA Voicemeeter codebase I was able to make this idea scale by creating a Builder class with steps defined as methods, then calling each step in turn. For example, when laying out the Hardware Input ButtonMenus I did the following:

    def make_tab0_row0(self) -> psg.Frame:
        """tab0 row0 represents hardware ins"""

        def add_physical_device_opts(layout):
            devices = util.get_input_device_list(self.vm)
            devices.append("- remove device selection -")
            layout.append(
                [
                    psg.ButtonMenu(
                        f"IN {i + 1}",
                        size=(6, 3),
                        menu_def=["", devices],
                        key=f"HARDWARE IN||{i + 1}",
                    )
                    for i in range(self.kind.phys_in)
                ]
            )

        hardware_in = []
        # run each build step in turn; each step appends a row to the layout
        [step(hardware_in) for step in (add_physical_device_opts,)]
        return psg.Frame("Hardware In", hardware_in)

Here a list representing the layout is passed to each builder step, which in turn places each ButtonMenu element sequentially.

I was then able to scale this further by applying the same idea to each tab. Importantly, this gave me the freedom to structure the precise layout of the rows dynamically, according to each kind of Voicemeeter.

    layout0 = []
    if self.kind.name == "basic":
        steps = (
            self.make_tab0_row0,
            self.make_tab0_row1,
            self.make_tab0_row5,
        )
    else:
        steps = (
            self.make_tab0_row0,
            self.make_tab0_row1,
            self.make_tab0_row2,
            self.make_tab0_row3,
            self.make_tab0_row4,
            self.make_tab0_row5,
        )
    for step in steps:
        layout0.append([step()])

Next I'll talk a bit about the event loop. Unlike other frameworks I've worked with, PySimpleGUI events are not based on callbacks but on an event loop messaging system. Specifically, by running a while loop and evaluating the result of the read() method on the main window object, we receive event data in the form of an event string and a values dictionary. Like so:

    while True:
        event, values = self.read()
        self.logger.debug(f"event::{event}")
        self.logger.debug(f"values::{values}")
        if event in (psg.WIN_CLOSED, "Exit"):
            break

This gave me the idea to employ the pyparsing library, which describes itself as "an alternative approach to creating and executing simple grammars, vs. the traditional lex/yacc approach, or the use of regular expressions". Since I already had a good idea of what the event identifiers would look like (channel type, index, property type and so on), I figured this was an ideal approach to parsing the events. By defining a parser that could split the widget type from the parameter it represents and the event that triggered it, I was able to handle events such as this:

    case [["BUS", index], [param], ["KEY", "SPACE" | "ENTER"]]:
        if param == "MODE":
            util.open_context_menu_for_buttonmenu(self, f"BUS {index}||MODE")
        else:
            self.find_element_with_focus().click()

This, for example, allowed me to define the action taken when Space or Enter was pressed on any element representing a Bus parameter.
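
For illustration, a minimal pyparsing grammar along these lines can split an event key into the token groups that the match statement expects. This is just a sketch, not the project's actual parser, and the sample event string is made up:

    import pyparsing as pp

    # each "||"-separated segment becomes a group of whitespace-split tokens
    token = pp.Word(pp.alphanums + "-")
    group = pp.Group(pp.OneOrMore(token))
    event_parser = pp.delimited_list(group, delim="||")

    parsed = event_parser.parse_string("BUS 1||MODE||KEY SPACE").as_list()
    # parsed == [["BUS", "1"], ["MODE"], ["KEY", "SPACE"]]

The resulting list of lists can then be fed straight into a match statement like the one above.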


All in all I was pleased with my choice to investigate the PySimpleGUI library. It let me spend more time focusing on the functionality and less time thinking about layouts and callbacks.

The only roadblock I came across was the ButtonMenu elements. By default I was unable to open their context menus with the keyboard, only with the mouse. After reaching out to the PSG devs, they informed me that by modifying the underlying Widget object I could make ButtonMenus focusable with the keyboard.

    # let the underlying tk widgets take keyboard focus and show a focus ring
    buttonmenu_opts = {"takefocus": 1, "highlightthickness": 1}
    for i in range(self.kind.phys_in):
        self[f"HARDWARE IN||{i + 1}"].Widget.config(**buttonmenu_opts)

I will finish off by talking about the NVDA controller client. The API it presents is small, exporting just four functions:

/* [comm_status][fault_status] */ error_status_t __stdcall nvdaController_testIfRunning( void);

/* [comm_status][fault_status] */ error_status_t __stdcall nvdaController_speakText( 
    /* [string][in] */ const wchar_t *text);

/* [comm_status][fault_status] */ error_status_t __stdcall nvdaController_cancelSpeech( void);

/* [comm_status][fault_status] */ error_status_t __stdcall nvdaController_brailleMessage( 
    /* [string][in] */ const wchar_t *message);
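
From Python, the library can then be loaded with ctypes. A minimal sketch; the DLL filename is an assumption, adjust it to match your copy of the controller client:

    import ctypes

    # load the NVDA controller client DLL (64-bit build assumed here)
    libc = ctypes.WinDLL("nvdaControllerClient64.dll")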

The one I was most concerned with was nvdaController_speakText, but nonetheless, since the API was so small, I decided to define bindings for all four functions and present them in a Python wrapper class:

class CBindings:
    bind_test_if_running = libc.nvdaController_testIfRunning
    bind_speak_text = libc.nvdaController_speakText
    ...

    def call(self, fn, *args, ok=(0,)):
        # invoke the bound C function, raising if it returns an unexpected status
        retval = fn(*args)
        if retval not in ok:
            raise NVDAVMCAPIError(fn.__name__, retval)
        return retval


class Nvda(CBindings):
    @property
    def is_running(self):
        return self.call(self.bind_test_if_running) == 0

    def speak(self, text):
        self.call(self.bind_speak_text, text)
    ...
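
Usage then looks something like this (a hypothetical snippet; the spoken text is made up):

    nvda = Nvda()
    if nvda.is_running:  # note: call() raises NVDAVMCAPIError on an unexpected status
        nvda.speak("NVDA Voicemeeter ready")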

This allowed me to add auditory feedback on both Focus In events and parameter changes caused by user input. For example, when focus moves to the tabgroup:

    case ["CTRL-TAB"] | ["CTRL-SHIFT-TAB"]:
        self["tabgroup"].set_focus()
        self.nvda.speak(f"{values['tabgroup']}")

This was a fairly extensive task since, by default, the NVDA screen reader was unable to recognise any of the elements that PySimpleGUI presents.
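
In practice that meant wiring up the focus events myself. A sketch of the general idea, using PySimpleGUI's Element.bind to forward tkinter focus events into the event loop (the "||FOCUS IN" suffix and spoken text are illustrative, not the project's actual identifiers):

    # bind() appends the given suffix to the element's key, so a focus change
    # arrives in the event loop as e.g. "HARDWARE IN||1||FOCUS IN"
    for i in range(self.kind.phys_in):
        self[f"HARDWARE IN||{i + 1}"].bind("<FocusIn>", "||FOCUS IN")

and then, once parsed, handled like any other event:

    case [["HARDWARE", "IN"], [index], ["FOCUS", "IN"]]:
        self.nvda.speak(f"hardware input {index}")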


This is the first and only accessible application I've developed, and I'm very glad I took on the challenge. Before starting I had no idea how involved it would be nor how long it would take, but for the most part it was a problem-free process, taking about one month to complete. Since writing the GUI I've had a few people reach out to thank me profusely, for which I am very grateful. The entire point from the start was to develop an application that would bring accessibility to Voicemeeter for those users who required it. If I have helped some of those users then it was all worth the effort.