CSE researchers present new findings and tech at UIST 2023

CSE researchers have two papers and four demos appearing at the conference, covering new tech that improves accessibility, enhances the user experience, and helps surgeons-in-training.
Image caption: A restaurant menu touchscreen overlaid with the numbered grid that BrushLens adds to help users navigate the screen. BrushLens, developed by CSE researchers and being presented at UIST, helps visually and motor-impaired users operate touchscreens, such as those increasingly found in restaurants.

Several CSE faculty and students are presenting new research at the 2023 ACM Symposium on User Interface Software and Technology (UIST), a top international forum for new research and innovation in human-computer interaction and interfaces. Hosted by the ACM Special Interest Groups on Computer-Human Interaction (SIGCHI) and Computer Graphics (SIGGRAPH), UIST fosters a unique confluence of disciplines, bringing together researchers from graphical and web user interfaces, tangible and ubiquitous computing, extended and augmented reality, human-centered artificial intelligence, and more. This year's iteration of the conference is taking place October 29 through November 1, 2023, in San Francisco, CA.

The research being presented by CSE researchers reflects the diversity of topic areas covered at UIST, including new tech that improves accessibility, such as a phone case that makes touchscreens more accessible to people with visual or motor impairments, as well as a sound blender that enhances the accessibility of mixed-reality environments. Other CSE innovations include a web automation technique that improves user control and confidence, and a mapping tool that supports video-based learning for surgeons.

The two papers and four demos by U-M researchers appearing at UIST are as follows, with authors affiliated with CSE in bold:

BrushLens: Hardware Interaction Proxies for Accessible Touchscreen Interface Actuation (paper & demo)

Chen Liang, Yasha Iravantchi, Thomas Krolikowski, Ruijie Geng, Alanson P. Sample, Anhong Guo

Abstract: Touchscreen devices, designed with an assumed range of user abilities and interaction patterns, often present challenges for individuals with diverse abilities to operate independently. Prior efforts to improve accessibility through tools or algorithms necessitated alterations to touchscreen hardware or software, making them inapplicable for the large number of existing legacy devices. In this paper, we introduce BrushLens, a hardware interaction proxy that performs physical interactions on behalf of users while allowing them to continue utilizing accessible interfaces, such as screenreaders and assistive touch on smartphones, for interface exploration and command input. BrushLens maintains an interface model for accurate target localization and utilizes exchangeable actuators for physical actuation across a variety of device types, effectively reducing user workload and minimizing the risk of mistouch. Our evaluations reveal that BrushLens lowers the mistouch rate and empowers visually and motor impaired users to interact with otherwise inaccessible physical touchscreens more effectively.

You can read more in this recent story about BrushLens.
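To make the idea concrete, the sketch below shows how a BrushLens-style proxy could tie together the two pieces the abstract describes: an interface model that localizes the target the user selected through their phone's accessible interface, and an exchangeable actuator that performs the physical tap. This is an illustrative outline only; the class and method names are hypothetical and do not come from the paper.

```python
# Hypothetical sketch of a BrushLens-style interaction proxy (not the authors' code).
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Target:
    """A UI element the user selected through their phone's accessible interface."""
    label: str
    x_mm: float  # position on the physical touchscreen in the interface model's frame
    y_mm: float


class Actuator(Protocol):
    """Exchangeable actuation hardware, e.g. a capacitive tapper."""
    def tap(self, x_mm: float, y_mm: float) -> None: ...


class InterfaceModel:
    """Maps logical UI targets to physical coordinates on the legacy touchscreen."""
    def __init__(self, targets: dict[str, tuple[float, float]]):
        self._targets = targets

    def locate(self, label: str) -> Target:
        x, y = self._targets[label]
        return Target(label, x, y)


def actuate_for_user(model: InterfaceModel, actuator: Actuator, label: str) -> None:
    # The user explores and confirms the command via a screenreader or assistive
    # touch; the proxy performs the physical tap, reducing the risk of mistouch.
    target = model.locate(label)
    actuator.tap(target.x_mm, target.y_mm)
```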

MIWA: Mixed-Initiative Web Automation for Better User Control and Confidence (paper)

Weihao Chen, Xiaoyu Liu, Jiacheng Zhang, Ian Iong Lam, Zhicheng Huang, Rui Dong, Xinyu Wang, Tianyi Zhang

Abstract: In the era of Big Data, web automation is frequently used by data scientists, domain experts, and programmers to complete time-consuming data collection tasks. However, developing web automation scripts requires familiarity with a programming language and HTML, which remains a key learning barrier for non-expert users. We provide MIWA, a mixed-initiative web automation system that enables users to create web automation scripts by demonstrating what content they want from the targeted websites. Compared to existing web automation tools, MIWA helps users better understand a generated script and build trust in it by (1) providing a step-by-step explanation of the script’s behavior with visual correspondence to the target website, (2) supporting greater autonomy and control over web automation via step-through debugging and fine-grained demonstration refinement, and (3) automatically detecting potential corner cases that are handled improperly by the generated script. We conducted a within-subjects user study with 24 participants and compared MIWA with Rousillon, a state-of-the-art web automation tool. Results showed that, compared to Rousillon, MIWA reduced the task completion time by half while helping participants gain more confidence in the generated script.
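As an illustration of the kind of artifact MIWA reasons about, the sketch below shows a web-scraping script broken into explicit, explainable steps, similar in spirit to the step-by-step explanation and corner-case checking described in the abstract. The example page structure, CSS selectors, and step names are hypothetical placeholders, not MIWA's actual output or API.

```python
# Hypothetical sketch of a step-by-step web automation script (not MIWA's output).
import requests
from bs4 import BeautifulSoup

# Each step is named so a tool could explain it, pause on it, or debug it.
STEPS = [
    ("load page", lambda ctx: ctx.update(html=requests.get(ctx["url"]).text)),
    ("parse document", lambda ctx: ctx.update(soup=BeautifulSoup(ctx["html"], "html.parser"))),
    ("select item rows", lambda ctx: ctx.update(rows=ctx["soup"].select("div.result-row"))),
    # A row missing an <h3> would fail below; that is the kind of corner case such tools flag.
    ("extract titles", lambda ctx: ctx.update(
        titles=[row.select_one("h3").get_text(strip=True) for row in ctx["rows"]])),
]


def run(url: str, verbose: bool = True) -> list:
    ctx = {"url": url}
    for name, step in STEPS:
        step(ctx)  # a real mixed-initiative tool would let the user step through here
        if verbose:
            print(f"step '{name}' done")
    return ctx["titles"]
```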

DeckFlow: A Card Game Interface for Exploring Generative Model Flows (demo)

Gregory Croisdale, John Joon Young Chung, Emily Huang, Gage Birchmeier, Xu Wang, Anhong Guo

Abstract: Recent Generative AI models have been shown to be substantially useful in different fields, often bridging modal gaps, such as text-prompted image or human motion generation. However, their accompanying interfaces do not sufficiently support iteration and interaction between models, and due to the computational intensity of generative technology, can be unforgiving to user errors and missteps. We propose DeckFlow, a no-code interface for multimodal generative workflows which encourages rapid iteration and experimentation between disparate models. DeckFlow emphasizes the persistence of output, the maintenance of generation settings and dependencies, and continual steering through user-defined concept groups. Taking design cues from Card Games and Affinity Diagrams, DeckFlow is aimed to lower the barrier for non-experts to explore and interact with generative AI.
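One way to picture the card metaphor is as a small data model in which each card keeps its output, its generation settings, and its upstream dependencies, so a workflow can be re-run or steered without losing work. The sketch below is a hypothetical illustration under those assumptions, not DeckFlow's implementation; all names are invented for the example.

```python
# Hypothetical data model for a card-based generative workflow (not DeckFlow's code).
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Card:
    model: str      # e.g. a text-to-image or text-to-motion model
    settings: dict  # generation parameters persisted with the card
    inputs: List["Card"] = field(default_factory=list)  # upstream dependencies
    output: Optional[object] = None  # cached output, so iteration never loses work


@dataclass
class ConceptGroup:
    """A user-defined group of cards used to steer later generations."""
    name: str
    cards: List[Card] = field(default_factory=list)


def regenerate(card: Card, run_model: Callable) -> object:
    """Re-run a card from its saved settings and the outputs of its dependencies."""
    upstream = [c.output if c.output is not None else regenerate(c, run_model)
                for c in card.inputs]
    card.output = run_model(card.model, card.settings, upstream)
    return card.output
```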

SoundBlender: Manipulating Sounds for Accessible Mixed-Reality Awareness (demo)

Ruei-Che Chang, Chia-Sheng Hung, Dhruv Jain, Anhong Guo 

Abstract: Sounds are everywhere, from real-world content to virtual audio presented by hearing devices, which create a mixed-reality soundscape that entails rich but intricate information. However, sounds often overlap and conflict in priorities, which makes them hard to perceive and differentiate. This is exacerbated in mixed-reality settings, where real-world and virtual sounds can conflict with each other. This can further hinder mixed-reality awareness for blind people, who heavily rely on audio information in their everyday lives. To address this, we present SoundBlender, a sound rendering framework consisting of six sound manipulators for users to better organize and manipulate real and virtual sounds across time and space: Ambience Builder, Feature Shifter, Earcon Generator, Prioritizer, Spatializer, and Stylizer. We demonstrate how the sound manipulators can increase mixed-reality awareness through a simulated working environment and a meeting application.
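For readers wondering what composing "sound manipulators" might look like in practice, here is a minimal, hypothetical sketch that applies manipulator functions over a list of real and virtual sound events. The data fields and the behaviors of the two example manipulators are assumptions made for illustration; they are not SoundBlender's actual design.

```python
# Hypothetical sketch of composing sound manipulators over a soundscape
# (not SoundBlender's implementation).
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SoundEvent:
    source: str      # "real" or "virtual"
    label: str       # e.g. "doorbell" or "meeting notification"
    priority: int = 0
    position: tuple = (0.0, 0.0, 0.0)  # where the sound will be spatialized

# A manipulator takes the current soundscape and returns a transformed one.
Manipulator = Callable[[List[SoundEvent]], List[SoundEvent]]


def prioritizer(events: List[SoundEvent]) -> List[SoundEvent]:
    # Order sounds so higher-priority ones are rendered first.
    return sorted(events, key=lambda e: -e.priority)


def spatializer(events: List[SoundEvent]) -> List[SoundEvent]:
    # Spread overlapping sounds apart in space (here, naively along one axis).
    for i, event in enumerate(events):
        event.position = (float(i), 0.0, 0.0)
    return events


def render(events: List[SoundEvent], pipeline: List[Manipulator]) -> List[SoundEvent]:
    # e.g. render(events, [prioritizer, spatializer]) for a simulated working environment
    for manipulate in pipeline:
        events = manipulate(events)
    return events
```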

SketchSearch: Fine-tuning Reference Maps to Create Exercises In Support of Video-based Learning for Surgeons (demo)

Jingying Wang, Xu Wang, Vitaliy Popov

Abstract: Video-based surgical coaching involves mentors reviewing surgery footage with trainees. Although effective, its use is sporadic due to time constraints. We propose AI-augmented coaching through SketchSearch, allowing experts to create exercises with automated feedback in surgery videos for self-learning. Surgeons often seek specific scenes for teaching, relying on visual cues. SketchSearch simplifies this through a three-step process: key frame extraction, creation of template reference maps via image segmentation, and fine-tuning for frame retrieval.
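To give a flavor of the first step in that pipeline, the sketch below extracts candidate key frames from a video with simple frame differencing in OpenCV. This is an illustrative stand-in only; the threshold, and the segmentation and fine-tuned retrieval steps that would follow, are placeholders rather than the authors' method.

```python
# Hypothetical key-frame extraction step (not SketchSearch's actual pipeline).
import cv2
import numpy as np


def extract_key_frames(video_path: str, diff_threshold: float = 30.0):
    """Keep frames that differ noticeably from the previous frame."""
    cap = cv2.VideoCapture(video_path)
    key_frames, prev_gray, index = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev_gray is None or np.mean(cv2.absdiff(gray, prev_gray)) > diff_threshold:
            key_frames.append((index, frame))
        prev_gray = gray
        index += 1
    cap.release()
    # Next steps in the abstract: segment key frames into template reference maps,
    # then fine-tune a model to retrieve matching frames.
    return key_frames
```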