AI & data analytics
read, write & connect

The data blog that business and technology leaders need to read to fully understand the potential, power AND limitations of data science and data analysis.

Let's go with my opinion if we do not have enough data

Jesús Córdoba

"If we have data, let's look at data. If all we have are opinions, let's go with mine." - Jim Barksdale, former Netscape CEO


Let me start with the classic: "I've been there, I've done that."

People who are undecided between options may flip a coin or use other aids that produce random outcomes to support decision-making. Such aids lead to straightforward suggestions, which people do not necessarily follow.

Instead, when looking at the outcome, individuals sometimes appear to like or dislike the direction and then decide according to this feeling.

An opinion or a coin flip is considered a fair means of making decisions. However, when the outcome is more important, a decisive coin toss becomes less acceptable, as this approach seems to conflict with traditional ideas about argument-based rationality and the personal responsibility of the decision maker.

My fourteen years of experience tell me that most IT executives who make a "gut feeling or coin toss" decision are likely to change their minds within six months. To prove the need for organizational change, you must provide hard facts and data that shape thoughts and decisions. Most of us have learned this while trying to influence change, which is notoriously difficult. The right decisions backed with good data can support businesses in ways a "good hunch" never could.


Remember, Data tell stories. - Jesús Córdoba


Data is a powerful tool for making good business decisions. It can help you clearly see where you should prioritize your resources and focus your attention.

For such a simple word, Data is quite a complicated topic. Like "love." Well, not that complicated.

As we enter August, I would like to talk about what Data stands for, the different types, and how executive leaders express and talk about Data terms and types. Let's get started.


Bonus Story

Jim Barksdale was called before Congress several times during hearings about Microsoft and its alleged abuse of its operating system monopoly to dominate the web browser market. At one point he addressed the entire room: "How many of you use Intel-based PCs in this audience, not Macintoshes?" Most people in the room raised their hands. "Of that group who use PCs, how many of you use a PC without Microsoft's operating system?" All of the hands went down. He said to the Senate panel, "Gentlemen, that is a monopoly."

Information VS Data


Let's create a common language. We will start by making a distinction between the two.

  • Information is that portion of the content of a signal or message which conveys meaning. Information is not knowledge itself but rather the representation of it. Information is derived from thinking about something new in our brains.
  • Data is studied in many different fields, which is why many names exist for the same things. A lot depends on the context. However, Data is the smallest unit of factual information that can be used as a basis for calculation, reasoning, or discussion.

There are different data types.

  • Observational
  • Experimental
  • Structured
  • Unstructured
  • Discrete
  • Continuous

I will keep the following definitions short to make them easy to understand. We will go into more detail in future blog posts.


Data Types

When collecting your Data, it is crucial to understand its form so that you can interpret and analyze it effectively.

It is essential to identify Data based on differences and similarities.


Numeric and Categorical Data

Data comes in two flavors: Numeric and Categorical.

Numeric Data is the easy one: it's numbers. Categorical Data is everything else.


Discrete and Continuous Data

Discrete Data is a numerical type of data that includes whole, concrete numbers with specific and fixed values determined by counting.

Example:

  • The number of people in one building
  • The number of items you buy at the store


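Continuous Data is a numerical type of data that can take any value within a range, including fractions and decimals. It is determined by measuring rather than counting, often over a specific time interval.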

Example:

  • The daily wind speed
  • The temperature of a freezer
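To make the difference concrete, here is a minimal Python sketch. The readings are invented for illustration, and `looks_discrete` is a hypothetical helper that applies a rough heuristic, not a formal test: a continuous measurement can still be recorded as whole numbers.

```python
# Hypothetical readings, invented for illustration only.
people_in_building = [12, 30, 7, 18]              # discrete: counted, whole numbers
freezer_temps_c = [-18.2, -17.9, -18.45, -18.0]   # continuous: measured, any value in a range


def looks_discrete(values):
    """Return True when every value is a whole number (a rough heuristic, not a rule)."""
    return all(float(v).is_integer() for v in values)


print(looks_discrete(people_in_building))
print(looks_discrete(freezer_temps_c))
```

The counts pass the check because you cannot have half a person in a building. The freezer temperatures do not, because a thermometer can report any value between two readings.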


Structured and Unstructured Data

Structured Data consists of clearly defined data types with patterns that make them easily searchable, while unstructured Data ("everything else") is usually not as easily searchable and includes formats like audio, video, and social media postings.

Example of structured Data

  • Relational databases (RDBMS)
  • Fields that store length-delineated data, like phone numbers, Social Security numbers, or ZIP codes

Example of unstructured Data

  • Text files: Word processing, spreadsheets, presentations, emails, logs.
  • Social Media: Data from Facebook, Twitter, and LinkedIn.


Both have cloud-use potential, but structured Data generally requires less storage space, while unstructured Data requires more.
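A small Python sketch can show why structured Data is easier to search. Everything here is hypothetical: the `Customer` record, the names, the phone numbers, and the email text are made up for illustration, and the phone pattern is an assumption about how numbers are written.

```python
import re
from dataclasses import dataclass


@dataclass
class Customer:
    """A structured record: every field has a fixed name and meaning."""
    name: str
    phone: str
    zip_code: str


customers = [
    Customer("Ana", "555-0101", "10001"),
    Customer("Luis", "555-0102", "94105"),
]

# Structured Data: ask for a field directly.
in_10001 = [c.name for c in customers if c.zip_code == "10001"]

# Unstructured Data: the same facts are buried in free text,
# so we fall back on pattern matching and hope the format holds.
email = "Hi, this is Luis. Call me on 555-0102 after lunch, I moved to 94105 last month."
phones_found = re.findall(r"\b\d{3}-\d{4}\b", email)

print(in_10001)
print(phones_found)
```

With the structured records, the ZIP code question is a single comparison. With the email, the answer depends on guessing how people write phone numbers, which is exactly what makes unstructured Data harder to search.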


Observational VS Experimental

  1. Observational: Data is collected based on what's seen or heard by a person or computer.
  2. Experimental: Data is collected following the scientific method using a prescribed methodology.


Data "is" VS Data "are"

Data does not always look like a dataset or spreadsheet. It's often in the form of summary statistics.

The three most common summary statistics are

  • Mean
  • Median
  • Mode.

Most executives describe Data with words like "normal," "usual," "typical," or "average," treating them as synonyms for these three terms, and that's incorrect.

This is how you talk as a data executive:

The mean is the sum of all the numbers you have divided by the count of all the numbers.

The median is the middle value once you sort the data in order: half of the values fall at or below it, and half at or above it.

The mode is the most common number in the dataset.

Mean, median, and mode are called measures of location or measures of central tendency. It's a common mistake for people to use the average (mean) to represent the midpoint of the data, which is the median. They assume half of the numbers must be above average and half below. This isn't necessarily true: a few extreme values can pull the mean far away from the median. To avoid confusion and misconceptions, we recommend sticking with mean or average, median, and mode for full transparency.
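Here is a short sketch of the three statistics using Python's built-in `statistics` module. The salary figures are hypothetical, chosen so that one large value skews the data and the "half above, half below average" assumption visibly fails.

```python
from statistics import mean, median, mode

# Hypothetical annual salaries (in thousands) for a nine-person team,
# including one executive whose pay skews the distribution.
salaries = [40, 42, 45, 45, 48, 50, 52, 55, 400]

team_mean = mean(salaries)      # sum of all values divided by their count
team_median = median(salaries)  # middle value once the data is sorted
team_mode = mode(salaries)      # most common value

# Count how many salaries sit above the mean to test the
# "half above average, half below" assumption.
above_mean = sum(1 for s in salaries if s > team_mean)

print(f"mean={team_mean:.1f}, median={team_median}, mode={team_mode}")
print(f"{above_mean} of {len(salaries)} salaries are above the mean")
```

In this made-up team the mean is roughly 86, yet only one person out of nine earns more than that. The median of 48 is a much better description of the midpoint, which is why naming the statistic you are quoting matters.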

Try not to use words like usual, typical, or normal.

We hope we have provided you with a common language to speak about your Data in the workplace. With the correct terminology in place, you are ready to start thinking statistically about the Data you read or come across.


"Statistical thinking is a different way of thinking that is part detective, skeptical, and involves alternate takes on a problem." - Frank Harrell, statistician, and professor