Data Science

Data science is the interdisciplinary study of data, using advanced tools and techniques to detect patterns, extract information, and inform organizational decisions.

Data science increasingly uses machine learning techniques to develop predictive models for testing and analyzing large volumes of data.

Why use data science?

Data science was born from a need to analyze unstructured data. The insights and trends it surfaces can inform business decisions.

Structured data, which fills rows and columns in tools like Excel or Google Sheets, was historically easy to enter, search, compare, extract, and analyze. The 1990s brought a shift toward predominantly unstructured or semi-structured data. Unstructured data includes business documents, emails, social media, customer feedback, webpages, open-ended survey responses, images, audio, and video.

Data scientist vs. data analyst

Scientist | Analyst
“What is the data?” | “What does the data tell us?”
Focused on machine learning and algorithms | Focused on business administration
Developing operational models | Pre-processing and data gathering
In-depth programming knowledge | Scripting and statistical skills

How is data science used?

Data science techniques reveal insights that inform decisions, such as producing more of a specific product, building an office in a new location, or adding more calls to action (CTAs) to an email marketing campaign. If the data justifies the move, the company can act with evidence behind its decision.

Data science continues to evolve and hone its ability to test, manipulate, and use large volumes of unstructured and semi-structured data.

Examples of data science

  • Text analysis
  • Mention mining
  • Biometric analysis

Analyzing Text

Text analysis is the method of analyzing unstructured and semi-structured text for business insights. Whether the input is five thousand customer surveys or three years of invoices, applying data science to text analysis can process it in less time and with fewer resources than manual review.
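
As a rough illustration of text analysis on survey responses, the Python sketch below counts the most frequent terms across a handful of responses. The sample responses, the short stop-word list, and the function name top_terms are all made up for this example, and simple word counting stands in for the more capable techniques a real project would likely use.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept deliberately short for illustration.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "it", "but"}


def top_terms(responses, n=3):
    """Return the n most common non-stop-word terms across all responses."""
    counts = Counter()
    for text in responses:
        # Lowercase the text and pull out runs of letters as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Made-up survey responses (assumed data, not from a real survey).
responses = [
    "The delivery was slow but the support team was helpful.",
    "Slow delivery, and the packaging was damaged.",
    "Support was helpful and friendly.",
]

for term, count in top_terms(responses, 4):
    print(term, count)
```

Even this small count hints at recurring themes (delivery speed, support quality) that would be tedious to tally by hand across thousands of responses.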

Mining for Mentions

On social media, mentions of organizations, brands, and products can inform digital marketing strategy. Applying text analysis and machine learning lets organizations automate the collection of user insights from social media.
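
A minimal sketch of mention mining might look like the Python below, which counts how many posts mention each brand. The posts, the brand names Acme and Globex, and the function count_mentions are hypothetical. The sketch uses plain keyword matching rather than machine learning, so it only shows the counting step, not sentiment or context.

```python
import re
from collections import Counter


def count_mentions(posts, brands):
    """Count how many posts mention each brand (case-insensitive, whole words)."""
    counts = Counter()
    for brand in brands:
        # \b word boundaries keep "Globex" from matching inside "Globexx".
        pattern = re.compile(r"\b" + re.escape(brand) + r"\b", re.IGNORECASE)
        counts[brand] = sum(1 for post in posts if pattern.search(post))
    return counts


# Made-up posts and brand names (assumed data for illustration only).
posts = [
    "Loving my new Acme blender!",
    "acme support never answered my ticket.",
    "Switched from Acme to Globex last month.",
    "Globexx is a different word entirely.",
]
brands = ["Acme", "Globex"]

mentions = count_mentions(posts, brands)
for brand in brands:
    print(brand, mentions[brand])
```

Matching on whole words is a deliberate choice here: it avoids counting look-alike strings, at the cost of missing misspellings and nicknames that a trained model could catch.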

Utilizing Biometrics

In an age of protecting human identities and the data associated with them, biometric analysis uses image, video, sensor, and biometric data to authenticate users. From unlocking a smartphone to fingerprint identification and behavioral analysis, evaluating the volume of unstructured data involved would be impossible without data science.

History of data science

Data science came about as a term during the development of computers in the second half of the 20th century. Computer science was an upstart field of thought, while statistics had been a millennium in the making. For years, a debate over renaming fields of study ensued, but data science failed to catch fire until the 2000s.

Early timeline of data science as a term

Year Moment
1962 Mathematician John Tukey proposes a field of study called data analysis.
1974 Computer scientist Peter Naur proposes data science replace the term computer science.
1985 Statistician Chien-Fu Jeff Wu proposes data science replace the term statistics.
1990s Knowledge discovery and data mining are used to describe data science.
1996 The International Federation of Classification Societies features data science as a topic.
1997 C.F. Jeff Wu proposes data science replace the term statistics (again).
1998 Hayashi Chikio proposes data science contain three aspects: data design, collection, and analysis.

 

Sam Ingalls
Sam Ingalls is an award-winning writer and researcher covering enterprise technology, cybersecurity, data centers, and IT trends, for Webopedia, eSecurity Planet, ServerWatch, and Channel Insider.
