
Training Data And Ethical Issues Surrounding Artificial Intelligence Top of Mind For Data Scientists
Thursday, April 20, 2017

Third annual data scientist survey finds data scientists would rather break a leg than lose the training data used to teach AI systems to think.

SAN FRANCISCO, April 20, 2017 /PRNewswire/ -- Artificial Intelligence systems may one day drive a car, help cure diseases, or simply choose the movie we watch at night. However, the work to get there is tedious and time-consuming, according to the just-completed third annual Data Scientist Survey conducted by CrowdFlower, the essential human-in-the-loop AI platform for data science and machine learning teams.

The survey of nearly 200 data scientists found that the jobs they hate the most (cleaning, labeling and categorizing data) are where they have to spend the most time. For example, data scientists spend 500% more time cleaning, labeling and categorizing data than they spend mining the data. In fact, those surveyed said they spend twice as much time on these laborious tasks as on creating and building algorithms.

The reason? It is twofold. First, the lack of high-quality training data is the single biggest reason AI systems fail, according to the results of the survey. In fact, it is so critical that respondents said they'd rather break a leg than delete their training data. Second, data scientists have concerns about the integrity of the training data and worry that, if they aren't careful, the wrong training data could bias an AI system because it could be influenced by human prejudices around things such as religion, race or gender.

"There is a tremendous amount of hard work that is needed to make an AI system deliver on its promise and at the core is getting the training data right," said Robin Bordoli, CEO of CrowdFlower. "Cleaning, labeling and categorizing data isn't sexy or fun, but it's critical. Data scientists know it and that's why they are spending the bulk of their time doing the work they hate. The reality is that algorithms are far from perfect; however, with higher-quality training data - created by human intelligence - we can generate business value even with these imperfect algorithms."

As AI systems increasingly enter the mainstream, their usefulness is often defined by the quality of the training data used. While a machine can process complex mathematical equations or structured data in milliseconds, training data teaches a machine how to handle more abstract tasks, such as flagging inappropriate content or distinguishing between objects in images. While higher-quality initial training data will improve the accuracy of an algorithm's initial output, ongoing training data is required to keep improving the algorithm's results.

Among the other insights gleaned from AI experts:

    --  Ethical issues: AI and ethics is an issue that bears close watching in
        the coming years.  While the potential of AI replacing human-staffed jobs
        is an issue according to 42% of respondents, the biggest issue in their
        eyes is the impact of human bias in training data.  More than 63% of
        those surveyed said that they are concerned that human bias and
        prejudices such as race, religion or demographics will corrupt the data
        used to teach AI systems. Another 42% express skepticism that we can
        avoid the programming of biases and are concerned about the
        'impossibility of programming a commonly agreed upon moral code.'
    --  Job satisfaction: Data scientists love their jobs, even if they hate the
        grunt work.  More than 90% of those surveyed said they were happy doing
        their jobs. In fact, nearly 50% said they were thrilled.  Additionally,
        63% of those surveyed agree with the oft-quoted claim that data
        scientist is the sexiest job in the industry.
    --  Demand for data scientists: While the field of data science is still
        pretty new, there is no question that the job market for data scientists
        is red hot.  Even though the majority of respondents have been in their
        jobs for less than 5 years, they are getting called all the time about
        new opportunities.  Over half of the respondents are contacted at least
        once per week with a job offer and nearly 30% receive calls multiple
        times each week.

To view the full report, please visit:

About CrowdFlower
CrowdFlower is the essential human-in-the-loop AI platform for data science teams. CrowdFlower helps customers generate high-quality customized training data for their machine learning initiatives, or automate a business process with easy-to-deploy models and integrated human-in-the-loop workflows. The CrowdFlower software platform supports a wide range of use cases including self-driving cars, intelligent personal assistants, medical image labeling, content categorization, customer support ticket classification, social data insight, CRM data enrichment, product categorization, and search relevance.

Headquartered in San Francisco and backed by Canvas Venture Fund, Trinity Ventures, and Microsoft Ventures, CrowdFlower serves data science teams at Fortune 500 and fast-growing data-driven organizations across a wide variety of industries. For more information, visit

To view the original version on PR Newswire, visit:

SOURCE CrowdFlower
