Scientific Datasets

One of the most common tasks for a Strateos user is viewing datasets generated from the experiments we run for them. The data viewing and interpretation tools available to life scientists today are surprisingly limited, and a powerful feature of Strateos is the ability to immediately explore and understand data right in the Strateos web application. However, this experience had suffered neglect and needed UX and engineering improvements.

To set some context on the company

At Strateos, our mission is to turn the life sciences into an information technology, driven by data, computation, and high-throughput robotics with the goal of substantially advancing drug discovery. Our team brings together deep expertise in biology, automation, and software engineering to set new standards for the way scientific research is done.

Problem statement

Strateos offers full transparency into the data captured during an experiment. When we execute a user's experiment, we present the data collected as soon as it's available, in a visually rich format. This immediacy and ability to visually review results is a huge value-add for our users.

However, before this project, the UI architecture for viewing and analyzing datasets on Strateos was crumbling. Our users love having specific data visualizations for different types of experiments, but maintaining these many views was time-consuming and prone to bugs. Dataset views often broke, and adding new views required extensive custom coding.

The as-is experience for viewing datasets prior to this project was prone to bugs and visually messy.

Furthermore, our approach to generating datasets had fundamental flaws. When a user requested to download a dataset, it was generated on the fly using the current state of the system. This caused inconsistencies between downloads and prevented proper auditing – a huge liability.
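One way to avoid the on-the-fly problem is to capture a dataset once, at generation time, and serve that frozen copy for every later download. The sketch below only illustrates the idea and is not the Strateos implementation. The `DatasetSnapshot` shape, the `createSnapshot` and `verifySnapshot` helpers, and the FNV-1a checksum (a simple non-cryptographic stand-in for a real content hash) are all assumptions made for illustration.

```typescript
// Hypothetical shape of an immutable, auditable dataset snapshot.
interface DatasetSnapshot {
  readonly datasetId: string;
  readonly generatedAt: string; // ISO 8601 timestamp of capture
  readonly payload: string;     // serialized data, fixed at capture time
  readonly checksum: string;    // lets an auditor detect later tampering
}

// FNV-1a 32-bit hash: a small stand-in for a real content hash (not cryptographic).
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("00000000" + hash.toString(16)).slice(-8);
}

// Serialize the live state once, so later changes to the system cannot leak in.
function createSnapshot(datasetId: string, liveState: unknown, now: Date): DatasetSnapshot {
  const payload = JSON.stringify(liveState);
  return Object.freeze({
    datasetId,
    generatedAt: now.toISOString(),
    payload,
    checksum: fnv1a(payload),
  });
}

// Recompute the checksum to confirm the stored payload is unchanged.
function verifySnapshot(snapshot: DatasetSnapshot): boolean {
  return fnv1a(snapshot.payload) === snapshot.checksum;
}
```

Every download then reads `snapshot.payload` rather than querying the current system state, so two downloads of the same dataset are identical and `generatedAt` records when the data was captured.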

Finally, the existing interface lacked clean aesthetics and made it difficult for users to quickly interpret their results.

Stakeholders

Working from the initial framing of this project, I identified the following stakeholder groups for which my solution would need to work:

[Figure: stakeholder groups]

The external users of Strateos hold many different titles, from Lab Managers to Accountants to Scientists. However, Scientists are typically the only members of a team who analyze datasets, so their needs are the focus of this project.

A good portion of Strateos's daily users are Strateos employees, responsible for customer success, scientific development, or simple lab maintenance. These individuals are frequent consumers of datasets.

As Strateos adds new scientific capabilities, our engineering team is responsible for building out the accompanying dataset visualizations. My proposals need to reduce both the engineering team's burden and the likelihood of introducing visual bugs.

Jobs to be done

The Strateos Head of Product and I developed a single Job To Be Done to guide this project.

A solution that addresses this Job To Be Done will address the individual needs of each stakeholder group.

As a biologist using Strateos, I want to view and access the data generated in an experiment in a single location, so that I can quickly understand if an experiment was successful, and know how I should proceed with my work.

Research and exploration

User research

With our Job To Be Done in place, I began a round of extensive 1:1 interviews with members of each stakeholder group. From these conversations I identified the following pain points:

Scientists struggle to quickly understand the basic outcome of their experiments, due to a cluttered interface, the need to download data files before viewing them, and changes in display between datasets from different experiments.
Scientists have no way to understand the state of the world when their data was generated, such as how much liquid was in a container when it was measured. This makes it extremely difficult to understand why an experiment might have failed.
When Scientists need to compare values from multiple datasets, their workflow is slow due to a poor navigation experience.
Scientists struggle to access the specific data they need from an experiment, and often end up downloading far more data than necessary.
Internal Scientists struggle to release new features to customers in a timely manner due to the long time required for software engineering to implement new dataset views.
Software Engineers must spend many hours duplicating and modifying UI code from other dataset views in order to implement new ones.

User needs

From these pain points, I narrowed the scope of the project down to four core user needs:

[Figure: the four core user needs]

Goals

My task was to provide a redesigned user experience for navigating and accessing datasets. The core requirements were:

Extensible: The proposed solution must work in a variety of contexts and be easily adaptable to future needs as new dataset types are added.
Auditable: It should be clear when a dataset was generated, and apparent that the data within it is static and represents a past state of the system.
Composable: Datasets are made up of Data Objects, and this solution should visually mirror that architecture.
Usable: Biologists have well-defined approaches to data. Our solution should not dictate a preferred approach but instead meet biologists where they stand.

Initial sketches

My initial sketches focused on the core components needed to appropriately display data objects, with some basic attention paid to layout. These sketches were foundational to the final architecture proposed.

Architecture Brainstorming

My sketches produced the beginnings of a visual system, which I decided to develop into a full framework from which a multitude of layouts could be composed.

To develop this framework, I detailed all of the data types we supported. Then, I audited the Strateos database to find the following most commonly executed experiments:

AutoPick
Plate Reader
Flow Cytometry
qPCR
Aliquot Measurement

Original brainstorming sketches for development of dataset visual system.

In the interest of solving for the third user need, I drew on my learnings from our scientist users and developed layouts optimized specifically for these common experiment types.

*User Need #3: Biologists need to view datasets in layouts and visualizations specific to the type of experiment that was run.

Final Proposal

Based on this exploration, I proposed a four-level hierarchy:

Primitives -> Layouts -> Templates -> Summaries

Primitives

Primitives are the basic visualizations we support. They are the atomic units from which all dataset views can be created. I designed and coded five of them, along with a set of Data Visualizations.

Image

Images of containers allow scientists to understand the physical state of their container.

Code block

JSON snippets can be displayed in a formatted view when no superior visualization is available.

CSV
CSV Data Objects can be displayed along with searching and sorting capabilities.
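
To make the searching and sorting behavior concrete, here is a minimal sketch in plain TypeScript. It is an illustration under stated assumptions, not the component that shipped: `parseCsv`, `searchRows`, and `sortRows` are hypothetical helpers, and the parser is deliberately naive (it assumes no quoted fields or embedded commas).

```typescript
type Row = Record<string, string>;

// Naive CSV parsing: first line is the header; assumes no quoted fields.
function parseCsv(text: string): Row[] {
  const lines = text.trim().split(/\r?\n/);
  const headers = lines[0].split(",").map((h) => h.trim());
  return lines.slice(1).map((line) => {
    const cells = line.split(",");
    const row: Row = {};
    headers.forEach((header, i) => {
      row[header] = (cells[i] || "").trim();
    });
    return row;
  });
}

// Keep rows where any cell contains the query (case-insensitive).
function searchRows(rows: Row[], query: string): Row[] {
  const needle = query.toLowerCase();
  return rows.filter((row) =>
    Object.keys(row).some((key) => row[key].toLowerCase().indexOf(needle) !== -1)
  );
}

// Sort by one column; compare numerically when both cells are numbers.
function sortRows(rows: Row[], column: string, direction: "asc" | "desc"): Row[] {
  const isNumeric = (s: string) => s !== "" && isFinite(Number(s));
  const sorted = rows.slice().sort((a, b) => {
    const x = a[column] || "";
    const y = b[column] || "";
    return isNumeric(x) && isNumeric(y) ? Number(x) - Number(y) : x.localeCompare(y);
  });
  return direction === "asc" ? sorted : sorted.reverse();
}
```

Sorting a copy (`rows.slice()`) leaves the parsed data untouched, so the table can always return to its original order.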

Plate map
Data Objects that represent single values for wells in a plate can be visualized as a plate map.
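
As a rough illustration of the data behind a plate map, the sketch below turns per-well values into a row-by-column grid. It assumes wells are named with a row letter and a column number (for example `A1` to `H12` on a standard 96-well plate, which has 8 rows and 12 columns); `wellToRowCol` and `toPlateGrid` are hypothetical names, not part of the Strateos codebase.

```typescript
type PlateGrid = (number | null)[][];

// Convert a well name such as "B3" into zero-based row and column indices.
function wellToRowCol(well: string): { row: number; col: number } {
  const match = /^([A-Z])(\d+)$/.exec(well.trim().toUpperCase());
  if (match === null) {
    throw new Error("Unrecognized well name: " + well);
  }
  const row = match[1].charCodeAt(0) - "A".charCodeAt(0);
  const col = parseInt(match[2], 10) - 1;
  return { row, col };
}

// Build a rows x cols grid; wells without a value stay null (rendered as empty).
function toPlateGrid(values: Record<string, number>, rows = 8, cols = 12): PlateGrid {
  const grid: PlateGrid = [];
  for (let r = 0; r < rows; r++) {
    const line: (number | null)[] = [];
    for (let c = 0; c < cols; c++) {
      line.push(null);
    }
    grid.push(line);
  }
  for (const well of Object.keys(values)) {
    const { row, col } = wellToRowCol(well);
    if (row >= rows || col < 0 || col >= cols) {
      throw new Error("Well " + well + " is outside a " + rows + "x" + cols + " plate");
    }
    grid[row][col] = values[well];
  }
  return grid;
}
```

A renderer can then map each cell of the grid to a colored well, leaving `null` cells blank.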

Table
Structured data not stored as a CSV is rendered in a table layout.

Data visualization
Often, basic plots and graphs help scientists digest an experiment's results. This set of data visualization primitives addresses the vast majority of our users' needs.
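
The mapping from a Data Object to a primitive can be sketched as a single selection function. This is a hedged illustration only: the `DataObject` fields (`contentType`, `shape`) and the `selectPrimitive` function are assumptions invented for the example, and the real rules may differ. The one behavior taken from the descriptions above is that the code block is the fallback when no better visualization is available.

```typescript
type PrimitiveKind =
  | "image"
  | "code-block"
  | "csv"
  | "plate-map"
  | "table"
  | "data-visualization";

// Hypothetical description of a Data Object; field names are assumptions.
interface DataObject {
  name: string;
  contentType: string; // e.g. "image/png", "text/csv", "application/json"
  shape?: "per-well" | "tabular" | "series";
}

// Pick the richest primitive available, falling back to a formatted code block.
function selectPrimitive(dataObject: DataObject): PrimitiveKind {
  if (dataObject.contentType.indexOf("image/") === 0) {
    return "image";
  }
  if (dataObject.contentType === "text/csv") {
    return "csv";
  }
  switch (dataObject.shape) {
    case "per-well":
      return "plate-map"; // single values for wells in a plate
    case "tabular":
      return "table"; // structured data not stored as CSV
    case "series":
      return "data-visualization"; // basic plots and graphs
    default:
      return "code-block"; // no superior visualization available
  }
}
```

Keeping the decision in one place means a new dataset type only needs a new rule here, not a new hand-built view.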

Layouts

The next level of the hierarchy is Layouts. While a large number of possible arrangements exist within our 12-column grid, I selected a small subset for use in datasets. This allows us to write template components and reduce the engineering overhead of implementing new dataset views.
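
A small, fixed set of layouts can be expressed as data and checked mechanically. The sketch below assumes each layout is a list of column spans that must fill the 12-column grid exactly; the layout names and spans shown are hypothetical examples, not the actual subset selected for Strateos.

```typescript
const GRID_COLUMNS = 12;

interface Layout {
  name: string;
  spans: number[]; // width of each region, in grid columns
}

// Hypothetical subset of allowed layouts (illustrative names and spans).
const LAYOUTS: Layout[] = [
  { name: "full-width", spans: [12] },
  { name: "halves", spans: [6, 6] },
  { name: "two-thirds-one-third", spans: [8, 4] },
];

// A layout is valid when every span is a positive whole number of columns
// and the spans add up to the full grid width.
function isValidLayout(layout: Layout): boolean {
  let total = 0;
  for (const span of layout.spans) {
    if (span < 1 || Math.floor(span) !== span) {
      return false;
    }
    total += span;
  }
  return layout.spans.length > 0 && total === GRID_COLUMNS;
}
```

Restricting templates to entries in a list like `LAYOUTS` is what keeps new dataset views from drifting into one-off arrangements.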

[Figure: the predefined layouts selected from the 12-column grid]

Templates

Layouts are used by Templates for clean, consistent content display. I defined four types of templates, each of which leverages one of the predefined layouts shown above. Every Template presents structured data, often previewed with a graphic such as a Plate Map or Data Visualization.

Drawing on my initial visual exploration, I mocked out high-fidelity examples of each template type. Along with each mock, I proposed a JSX structure for rendering it, to ensure that my proposals could be reflected by simple, sparse UI code.

Single data object
Used when no graphic representation of the data object is readily available. This Template is intended to be used sparingly, as the user needs are best addressed with visual representations of data.

[Figure: mock of the Single data object template]

Main Graphic with supporting graphics and data

Used when the primary user need is to see visualized data for recognizing patterns. The supporting graphic and data provide more structured views of the same data.
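
The way a Template places primitives into a layout can be sketched as plain data. The example below covers only the two templates described here and uses plain TypeScript rather than JSX. All names (`Template`, `Region`, `layoutRegions`) and the 8/4/12 column spans are assumptions for illustration, not the proposed JSX structure itself.

```typescript
type PrimitiveKind =
  | "image"
  | "code-block"
  | "csv"
  | "plate-map"
  | "table"
  | "data-visualization";

interface PrimitiveSpec {
  kind: PrimitiveKind;
  dataObjectId: string;
}

// The two template types described above, as a discriminated union.
type Template =
  | { type: "single-data-object"; data: PrimitiveSpec }
  | {
      type: "main-graphic";
      mainGraphic: PrimitiveSpec;
      supportingGraphic: PrimitiveSpec;
      structuredData: PrimitiveSpec;
    };

interface Region {
  slot: string;
  span: number; // columns out of 12 (spans here are illustrative)
  primitive: PrimitiveSpec;
}

// Resolve a template into the ordered regions a renderer would draw.
function layoutRegions(template: Template): Region[] {
  switch (template.type) {
    case "single-data-object":
      return [{ slot: "data", span: 12, primitive: template.data }];
    case "main-graphic":
      return [
        { slot: "mainGraphic", span: 8, primitive: template.mainGraphic },
        { slot: "supportingGraphic", span: 4, primitive: template.supportingGraphic },
        { slot: "structuredData", span: 12, primitive: template.structuredData },
      ];
  }
}
```

Because a template is just a small description of which primitive fills which slot, implementing a new dataset view becomes a matter of writing one such object rather than a custom page.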

[Figure: mock of the Main Graphic with supporting graphics and data template]

Summaries

Finally, datasets of specific types are represented as a Summary View that leverages one or more of the Templates. A Summary View brings the entire framework together and presents specific layouts for different experiment types, addressing the third user need. A Summary View lets the user:

Understand the type of instruction the dataset was created from, along with the time it was created, and the robotic infrastructure on which it was created, offering invaluable context to the user.
Quickly download either all of the raw data for a dataset, or single data objects, for use in other analysis tools.
See the details for the physical container from which data was collected. This is critical information that helps scientists contextualize the results.

Result

The nature of this proposal allowed for a healthy development process. While fellow engineers focused on backend modifications, I tackled the implementation of the Primitives and Layout components. This allowed our team to work independently and quickly at first. Then, with the various components in place, we collaborated to tie all the pieces together.

Overall, the project was incredibly well received by internal and external users. We heard from our own employees and our customers that they found their datasets far more approachable, and that they were able to evaluate the outcome of an experiment at a glance – success!
