An AI tool meant to reduce bias in investing could exacerbate it, experts say

CB Insights, a market intelligence firm with a client list of household names, bills the system as a “credit score, but for startup teams.”

Dianna "Mick" McDougall, Getty Images/Freder

“Credit score, but for startup teams.”

That’s how CB Insights markets Management Mosaic, its new prediction algorithm for scoring early-stage founders and management teams. The tool is meant to help the company’s roster of 1,000+ clients, including Cisco, Salesforce, and Sequoia, expedite investment, purchasing, and M&A decisions.

The algorithm’s basic premise is simple: Input a startup team’s background (résumé milestones and other criteria), and out comes a prediction of their “success” likelihood—implying how good an investment the individuals themselves might make.

Management Mosaic’s other stated goal is to help corporate decision-makers—think: VCs, executives, and company boards—avoid “subjective vetting,” or applying human biases to investment decisions.

“What used to require secret handshakes, network-based intros, and face-to-face coffee conversations has been simplified dramatically with Management Mosaic,” Anand Sanwal, CB Insights’ cofounder and CEO, told us. “Our aim is to make the assessment of teams infinitely more efficient and objective.”

But when Emerging Tech Brew spoke with AI activists, public-interest technologists, and some in the venture capital world, many shared a similar concern about the algorithm. The credit scores referenced in the product’s pitch have been shown to perpetuate economic inequality and limit opportunities for communities of color, and experts worry the CB Insights tool could do the same by exacerbating existing biases and widening the wealth gap for marginalized groups.

The algorithm is trained on so-called signals of success, according to the company—from a founder’s educational institution to their “network quality.” But because much of the data is historical, it could be particularly prone to bias. For example, just 1% of venture-backed startup founders are Black—and for Black women specifically, it’s less than 1%.

Sasha Costanza-Chock, director of research and design for the Algorithmic Justice League, a research and advocacy organization founded in 2016, believes that if CB Insights’ description of Management Mosaic is accurate, then the company’s approach to predicting startup success without bias is “extremely naive.”

“The company seems to be building a tool which is going to shape the industry—and help determine which startups get resources and which ones are successful,” Costanza-Chock said. “But by relying on past historical datasets that we know are deeply discriminatory to inform people about where they should invest, they risk automating already discriminatory investment patterns.”

When asked about Management Mosaic’s potential for bias, Sanwal told Emerging Tech Brew that because race, gender, and other protected classes aren’t included in the training data, the algorithm “doesn’t really care” about demographics. He said it also tends to weight recent outcomes more heavily.

“It’s just...taking this incredible array of dimensions and just sort of looking at, objectively, who would do well,” Sanwal added. “The results so far just sort of indicate that none of those factors are driving or having any kind of result here.”

CB Insights confirmed the tool is not third-party-reviewed, that it did not go through an audit or impact assessment, and that the company did not bring in an AI ethicist. The company said it uses Shapley values, a concept used in game theory, to determine the extent to which different factors impact a startup team’s score.
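
For readers unfamiliar with Shapley values, the idea can be illustrated with a minimal sketch. Everything below is hypothetical: the feature names, the point values, and the `toy_score` function are invented for illustration and are not CB Insights’ model or data. The sketch averages each feature’s marginal contribution to a score over every possible ordering of the features, which is the textbook definition and is only practical for a handful of features.

```python
from itertools import permutations


def shapley_values(features, score):
    """Exact Shapley values: average each feature's marginal
    contribution to the score over every ordering of the features."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = score(present)
            present.add(f)
            totals[f] += score(present) - before
    return {f: total / len(orderings) for f, total in totals.items()}


def toy_score(present):
    """Hypothetical scoring rule, invented for this example only."""
    s = 0.0
    if "elite_school" in present:
        s += 10
    if "early_hire" in present:
        s += 20
    # An interaction: the network bonus only applies alongside the school.
    if "elite_school" in present and "network" in present:
        s += 6
    return s


values = shapley_values(["elite_school", "early_hire", "network"], toy_score)
for name, value in values.items():
    print(f"{name}: {value:.1f}")
```

In this toy case the interaction bonus is split evenly between the two features that jointly produce it. Note what the technique does and does not show: it attributes a score to the inputs the model was given, so it cannot by itself reveal whether one of those inputs is standing in for a demographic trait that was left out.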

When asked if the company thought about access and privilege in looking at success predictors, and if anyone had been tasked with thinking about bias as the model was being built, Sanwal said, “The best way for us to impact that is that the model continues to learn.” He added that, as the composition of tech company founding teams changes, “the model will start to reflect that over time.”

“Based on public information about how they score founders and management teams, if everybody starts using their tool, then founding teams led by historically marginalized people, such as Black people, women, or gender nonconforming folks, and/or people with disabilities, may receive low scores and lose investors,” Costanza-Chock said. “Far from leveling the playing field, their tool may help ensure that a future world of potential parity never happens.”

Clients had been requesting a tool to gauge startup teams’ success potential for years, Sanwal said, and now, many clients are taking in Management Mosaic data via CB Insights’ API, pairing it with their own data, and building their own models and CRMs.

Sanwal told us the algorithm contributed to its “biggest Q3 ever.” The company claimed that “Fortune 500 clients” and “leading venture firms” are already using the tool, and that three out of four clients have interacted with Management Mosaic–based research or data since the product’s release. Sanwal and CB Insights did not share additional metrics supporting these claims.

The company is reportedly set to make nearly $100 million in annual recurring revenue next year—and hasn’t ruled out a 2022 IPO.

Disparate impact

Though Sanwal is insistent that the tool is blind to demographics, experts say that’s simply not how ML algorithms work.

To build Management Mosaic, which it began work on in late 2020, CB Insights chose companies with outsized returns in terms of noteworthy M&As, IPOs, and significant VC funding, then pulled data straight from the management teams’ résumés—including past companies, how early they were hired at a successful company, level of education, which college or university they attended, and “network quality based on other people they’re likely to know.”

The training data doesn’t include race, gender, or other demographic data, according to Sanwal, but some of the résumé categories it does include can serve as proxies for protected classes. (Proxies are variables that don’t name a protected class directly but are still correlated with it.)

For instance, Sanwal told us that although Management Mosaic didn’t flag founders with MBAs as a positive signal, it did consider startup CTOs with MBAs to be one—as well as early-stage hires at successful companies (e.g., Uber’s 50th product manager versus its 500th). But just 8% of MBA graduates are Black, and historically, early-stage hires at tech giants have skewed toward white or Asian males. Less than 2% of enterprise software startups in the US have a female founder, according to one recent report.

“Disparate impact is unintentional discrimination, where the policy or practice does not take race or sex or any of the protected characteristics into account, but one of those groups is being disproportionately affected,” Aaron Konopasky, a senior attorney-advisor in the EEOC’s office of legal counsel, told us. “In the case of AI, the demographic information itself doesn’t have to appear in the algorithm for that particular test to come out potentially discriminatory.”
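
The mechanism Konopasky describes can be sketched with a small synthetic example. The records, the `feeder_school` field, and the `blind_rule` function below are all made up for illustration and do not reflect any real dataset or CB Insights’ model. The rule never looks at group membership, yet because the proxy feature is unevenly distributed across the two groups, the selection rates diverge.

```python
def selection_rates(records, selected):
    """Share of each group that a decision rule selects."""
    counts = {}
    for r in records:
        total, picked = counts.get(r["group"], (0, 0))
        counts[r["group"]] = (total + 1, picked + (1 if selected(r) else 0))
    return {g: picked / total for g, (total, picked) in counts.items()}


def impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    return min(rates.values()) / max(rates.values())


# Synthetic data: 6 of 10 founders in group A have the proxy feature,
# versus 2 of 10 in group B. These numbers are invented.
records = (
    [{"group": "A", "feeder_school": True}] * 6
    + [{"group": "A", "feeder_school": False}] * 4
    + [{"group": "B", "feeder_school": True}] * 2
    + [{"group": "B", "feeder_school": False}] * 8
)


def blind_rule(record):
    """A 'demographic-blind' rule: it reads only the proxy feature."""
    return record["feeder_school"]


rates = selection_rates(records, blind_rule)
ratio = impact_ratio(rates)
print(rates)
print(f"impact ratio: {ratio:.2f}")
```

Here group A is selected at three times the rate of group B even though `group` is never read by the rule. In US employment settings, a ratio below 0.8 (the “four-fifths” rule of thumb in the EEOC’s Uniform Guidelines) is commonly treated as a signal worth investigating; whether a comparable threshold would apply to investment scoring is an open question and not something the sources quoted here addressed.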

In 2017, for instance, Amazon retired a recruiting algorithm because it had penalized résumés that included the word “women’s” and learned to demote graduates of some all-women’s colleges. And this year, both federal agencies and local governments began scrutinizing hiring algorithms more closely to ensure they comply with federal civil rights laws.

Four of the six publicly disclosed “signals” that CB Insights uses to inform an individual’s likelihood of success (past employers, educational institution, educational attainment, and “network quality”) are proxies for legally protected categories, since they’re factors shaped by race, socioeconomic status (SES), gender, and disability, according to the AJL’s Costanza-Chock.

“There is no evidence to suggest bias in the results,” Sanwal told us. “The alternatives to date—low-tech networking and intros—are costly in terms of both time and money. These methods also result in today’s decisions being focused on too few criteria from a biased sample of available information (i.e., your current network). Management Mosaic has the benefit of scale in that we analyze and synthesize massive amounts of data that highlight complex, nonlinear relationships. This helps clients see companies and founders they’d have not seen before and helps reduce individual bias.”

Sanwal added that the company will continue to adjust the algorithm based on new factors and data, as well as customer feedback.

“It’s hard to imagine how a score, CB Insights’ algorithm, will not just reinforce pre-existing biases further under a pretense of being bias-free—while at the same time reducing the sense of responsibility that investors should be doing to check for biases in their deal flow and investment pipelines,” Wilneida Negrón, cofounder of the Startups & Society Initiative, a nonprofit think tank focused on accelerating responsible tech company building practices, told us.

The tool was also built, in part, using public money: CB Insights received a total of $1.15 million in grants from the National Science Foundation for its precursor project called Mosaic, between 2010 and 2014, but the NSF does not require an algorithmic impact assessment to secure funding.

Deena Shakir, a partner at Lux Capital, told us that while she hasn’t used Management Mosaic, she’s often skeptical about what’s marketed as data-driven decision-making. And for Gabe Kleinman, who leads portfolio services and marketing at Obvious Ventures, a San Francisco–based VC firm, any sort of algorithmic decision-making calls for transparency, including about the training data and the attributes the model weights most heavily.

“There are a number of examples of well-intended products that have been built with the intention of removing human bias from a number of processes...and we have often seen how even the best-intended products can sometimes have the opposite effect,” Kleinman said. “Everyone should ask: ‘What could possibly go wrong?’ I would be curious to know how this team approached consequence scanning, or if they conducted any sort of pre-mortem with that question in mind.”

Costanza-Chock said that kind of analysis should be done by an objective third party, instead of via the Silicon Valley grade-your-own-homework structure.

“We need transparency and disclosure of [a] team’s internal bias audits, if they have any, but we also need independent, third-party audits, from regulators, investigative journalists, and/or independent researchers, to audit and analyze their claims,” Costanza-Chock added. “Only then will we know whether CB Insights’ ‘algorithmic shortcut’ to the ‘best startups’ is really just shorthand for the same old sexist, racist Silicon Valley investment patterns.”
