Dr. Joy Buolamwini on Machine Learning
Algorithmic Justice, Accountability, and Inclusive AI
Dr. Joy Buolamwini, computer scientist and founder of the Algorithmic Justice League, has shown the world that machine learning systems are never neutral. They reflect the data, design choices, and power structures that shape them. In Unmasking AI, she demonstrates how “accuracy” alone is insufficient: models can perform well on majority groups yet fail dramatically for those underrepresented. Her research on facial analysis revealed error rates disproportionately higher for darker-skinned women compared to lighter-skinned men, reframing bias as both a technical risk and a civil rights issue.
Our project, focused on American Sign Language (ASL) recognition, draws inspiration from Buolamwini’s ideals. We aim to create algorithms that give mute and deaf individuals a “voice” in spaces where interpreters are unavailable, helping them communicate more freely in everyday and professional life. This project is not only technical but also ethical: it embodies the principle that technology should empower marginalized communities rather than exclude them.
Project Goals
- Accessibility: Enable communication for deaf, mute, and partially hearing individuals without requiring an interpreter.
- Inclusivity: Design algorithms that recognize diverse hand positions and variations, ensuring learners and non-standard users are not penalized.
- Justice in Technology: Embed Buolamwini’s call for fairness, transparency, and accountability into every stage of development.
Incorporating Buolamwini’s Ideals
Buolamwini prescribes a lifecycle approach to accountability: audit models before and after deployment, document datasets, measure performance across demographic slices, and involve impacted communities. We incorporated these principles in several ways:
- Dataset Diversity: Rather than training only on uniform backgrounds or a narrow range of skin tones, we trained the model on varied hand positions and presentations.
- Transparency: We documented training decisions, including which letters were attempted, which hand positions were included, and which limitations were encountered.
- Community-Centered Design: Though the project is early in development, we envision involving deaf and mute individuals directly in testing and feedback.
- Performance Across Variations: By including non-standard hand placements, we ensured the system could serve learners and those with physical differences.
These decisions were deliberate: we prioritized the diversity of hand shapes and positions over background uniformity, recognizing that users will inevitably present a wide range of lighting conditions, skin tones, and camera qualities. The goal was to reduce exclusion and improve real-world usability, even at the prototype stage.
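Buolamwini’s practice of disaggregated evaluation can be applied directly to this setting. The sketch below is illustrative only: the group labels and prediction records are hypothetical, not our actual results. It computes accuracy per user group and the worst-case gap between groups, so failures concentrated in one slice are not hidden by a high overall average:

```python
from collections import defaultdict

def disaggregated_accuracy(records):
    """Compute per-group accuracy and the worst-case accuracy gap.

    records: iterable of (group, true_label, predicted_label) tuples.
    Returns (dict of group -> accuracy, max gap between groups).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, pred in records:
        total[group] += 1
        correct[group] += int(truth == pred)
    per_group = {g: correct[g] / total[g] for g in total}
    gap = max(per_group.values()) - min(per_group.values())
    return per_group, gap

# Hypothetical evaluation records: (hand-presentation group, true sign, predicted sign)
records = [
    ("standard_position", "A", "A"), ("standard_position", "B", "B"),
    ("standard_position", "C", "C"), ("standard_position", "D", "D"),
    ("rotated_wrist", "A", "A"), ("rotated_wrist", "B", "C"),
    ("rotated_wrist", "C", "C"), ("rotated_wrist", "D", "B"),
]

per_group, gap = disaggregated_accuracy(records)
print(per_group)   # in this made-up example, rotated wrists fare worse
print(f"worst-case accuracy gap: {gap:.2f}")
```

A model that reports a single 75% accuracy here would mask the fact that one group sees 100% and another 50%; the gap metric makes that disparity explicit.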
Technical Process
We used Google’s Teachable Machine to prototype recognition. The tool allowed us to train on snapshots of hand shapes but not motion. We tested multiple hand positions for letters at the beginning of the alphabet, but the model became too large to export fully. Despite these constraints, we prioritized inclusivity over efficiency.
This mirrors Buolamwini’s lesson that justice must guide design choices. Even when technical limitations exist, we chose to emphasize accessibility for diverse users rather than optimizing for speed or compactness. Our documentation includes the classes trained, dataset counts per class, and environmental conditions (lighting/background). This recordkeeping supports future audits and collaborative improvement.
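The recordkeeping described above can be kept as a small machine-readable “datasheet,” in the spirit of the dataset documentation Buolamwini advocates. The class names, counts, and conditions below are illustrative placeholders, not our project’s actual figures:

```python
import json

# Illustrative dataset datasheet; counts and conditions are placeholders,
# not the project's real figures.
datasheet = {
    "tool": "Google Teachable Machine",
    "classes": {
        "A": {"images": 40, "positions": ["standard", "rotated"]},
        "B": {"images": 35, "positions": ["standard"]},
        "C": {"images": 38, "positions": ["standard", "tilted"]},
        "D": {"images": 30, "positions": ["standard"]},
    },
    "conditions": {"lighting": "mixed indoor", "background": "varied"},
    "known_limitations": ["no motion capture", "few images per class"],
}

def underrepresented(sheet, minimum=50):
    """Flag classes whose image count falls below a chosen threshold."""
    return sorted(name for name, info in sheet["classes"].items()
                  if info["images"] < minimum)

print(json.dumps(datasheet, indent=2))
print("needs more data:", underrepresented(datasheet))
```

Keeping the datasheet in a structured format means future auditors can check class balance automatically rather than reconstructing it from memory.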
We did, however, encounter difficulties. For instance, the model kept keying on the person performing the signs rather than on the signs themselves, which risks introducing racial and other biases against users who do not resemble the original subject. We therefore switched to training on online images that show only the hand signs, without a person in the background. Even then, accuracy has remained limited, and technical constraints restricted us to just four letters (A, B, C, D). The model also needs further training, since we supplied relatively few images per class.
Limitations and Future Directions
- Motion Recognition: Teachable Machine cannot capture dynamic signs, limiting effectiveness for most ASL words.
- Model Size: Large datasets became difficult to save and export.
- Scope: Only a subset of letters was tested due to technical constraints.
These limitations highlight Buolamwini’s point that accuracy in controlled conditions is not enough. Real-world utility requires robustness, scalability, and inclusivity. Future iterations should move beyond snapshot-based tools toward platforms capable of video and temporal recognition, such as sequence models or transformer-based video encoders.
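As a small intermediate step toward the temporal recognition described above, per-frame snapshot predictions can be smoothed over a sliding window before adopting a full sequence model. This is a hedged sketch, not part of Teachable Machine’s API; the per-frame predictions are hypothetical:

```python
from collections import Counter, deque

def smooth_predictions(frame_preds, window=5):
    """Majority-vote over a sliding window of per-frame letter predictions,
    reducing single-frame flicker in a live recognizer."""
    recent = deque(maxlen=window)
    smoothed = []
    for pred in frame_preds:
        recent.append(pred)
        # Most common label in the window wins; ties favor earlier frames.
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed

# Hypothetical noisy per-frame output for a user holding the sign "A"
frames = ["A", "A", "B", "A", "A", "A", "C", "A"]
print(smooth_predictions(frames, window=3))
# → ['A', 'A', 'A', 'A', 'A', 'A', 'A', 'A']
```

Smoothing does not recognize motion, but it makes a snapshot-based prototype steadier for live use and establishes the frame-sequence plumbing a future temporal model would need.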
Lessons from Buolamwini’s Work
- Bias is Structural, Not Accidental: Just as facial recognition systems failed for darker-skinned women, ASL recognition could fail for users with non-standard hand positions or different skin tones.
- Accuracy Does Not Equal Justice: A model that recognizes signs perfectly for one group but fails for another is not “successful.”
- Transparency Builds Trust: Documenting datasets and training decisions ensures accountability and allows others to critique and improve the system.
- Community Participation is Essential: Impacted groups must guide design. For ASL recognition, deaf and mute individuals should shape priorities and evaluate usability.
- Accountability Across the Lifecycle: Pre-deployment audits, ongoing evaluation, and post-deployment monitoring are required to catch drift and inequities.