From math to SQL
Set theory — from pure math to SQL
This article explains the fundamental theory behind SQL, probability and statistics.
Set theory is one way to explain how different elements are distributed among groups, whether an element belongs to several groups, to one, or to none. In that sense, the theory describes the possible ways of grouping those elements. There is a practically infinite number of applications. You might use it for social bubbles, profiles in statistical surveys, book or product classification, or even propositional logic. It is also, of course, the foundation of the Structured Query Language (SQL) and of the “sets” concept in Python.
- Venn Diagram
- Elements belonging to sets
- Intersection
- Union
- When one set contains another (set operations)
- Review
- References
1. Venn diagram
The so-called Venn diagram graphically illustrates the relationship between two or more sets that may or may not have elements in common.
2. Elements belonging to sets
In the above illustration, the elements a1 and a2 belong to set A, while b1 and b2 belong to set B. The math notation for this association can be written as below:
a1, a2 ∈ A
b1, b2 ∈ B
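As a minimal sketch, the same membership relation can be checked with Python's built-in set type. The string labels are hypothetical stand-ins for the points in the diagram, and the contents of A and B are an assumption read off the description (a2 and b2 are placed in both sets, since they sit in the shared area).

```python
# Hypothetical labels: strings stand in for the points of the diagram.
A = {"a1", "a2", "b2"}
B = {"b1", "b2", "a2"}

# The `in` operator is Python's spelling of the membership symbol ∈.
a_members = all(x in A for x in ("a1", "a2"))
b_members = all(x in B for x in ("b1", "b2"))
print(a_members, b_members)  # True True

# a1 is not an element of B, and b1 is not an element of A.
print("a1" in B, "b1" in A)  # False False
```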
3. Intersection
Notice that a2 and b2 belong to both A and B: these two elements sit in an area common to the two sets. By contrast, a1 belongs only to A and b1 only to B. The area where a2 and b2 are found can be treated as a third set. This common area is called the intersection, so:
a2, b2 ∈ A ∩ B
meaning that a2 and b2 belong to the intersection between A and B.
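A short Python sketch of the intersection, assuming the same hypothetical string labels for the elements of A and B:

```python
# Hypothetical labels: a2 and b2 are the elements shared by both sets.
A = {"a1", "a2", "b2"}
B = {"b1", "b2", "a2"}

# `&` computes the intersection; A.intersection(B) is equivalent.
common = A & B
print(sorted(common))  # ['a2', 'b2']
```

In SQL, the INTERSECT operator plays a comparable role for the rows returned by two queries.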
4. Union
When we talk about the union of two sets, we mean all the elements that belong to at least one of the two sets.
The union of sets A and B is the set with elements {a1, a2, b1, b2}. Just as the intersection of two sets results in a third set, the union also results in a new set. The math notation for this is:
A ∪ B = {a1, a2, b1, b2}
Then, we can define the following sets C and D:
C = (A ∩ B)
D = (A ∪ B)
Set C is the intersection of A and B, while set D consists of the elements that belong to the union of A and B. Formally, we say that C equals A intersection B and D equals A union B.
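The definitions of C and D can be sketched in Python as follows. The element labels are hypothetical strings, chosen to match the diagram description.

```python
# Hypothetical labels for the elements of the Venn diagram.
A = {"a1", "a2", "b2"}
B = {"b1", "b2", "a2"}

C = A & B  # intersection: elements present in both A and B
D = A | B  # union: elements present in at least one of A and B

print(sorted(C))  # ['a2', 'b2']
print(sorted(D))  # ['a1', 'a2', 'b1', 'b2']
```

Because a Python set stores each element only once, the shared elements a2 and b2 appear a single time in D.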
5. When one set contains another (set operations)
Let’s take a look at the following image:
In the above diagram we have the sets A and B, with their respective elements a1 and b1.
We say that set B is a subset of A (or A contains B):
B ⊂ A
All elements belonging to B also belong to A, but not every element in A belongs to B.
From this example, we can extract the following properties:
A ∩ B = B
A ∪ B = A
b1 ∈ A, B
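When one set sits entirely inside another, the intersection gives back the smaller set and the union gives back the larger one. A minimal Python check, assuming A holds only a1 and b1 and B holds only b1 (hypothetical labels taken from the diagram description):

```python
# Hypothetical contents: B lies entirely inside A.
A = {"a1", "b1"}
B = {"b1"}

# `<` tests for a proper subset: every element of B is in A, and A has more.
is_proper_subset = B < A
print(is_proper_subset)  # True

# The intersection is the smaller set, the union is the larger set.
print(A & B == B)  # True
print(A | B == A)  # True

# b1 is an element of both sets.
print("b1" in A and "b1" in B)  # True
```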
Applying the subset idea to the Venn diagram from item 4, we can conclude that:
C ⊂ A, B
A, B ⊂ D
In other words, set C is contained in both A and B, while set D contains both A and B. As a consequence, every element of C belongs to both A and B, and every element belonging to A or B also belongs to D.
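These containment relations can be verified with a small Python sketch. It assumes the same hypothetical element labels as the diagram from item 4 and uses `<=`, Python's subset test, which also allows the two sets to be equal.

```python
# Hypothetical labels for the diagram from item 4.
A = {"a1", "a2", "b2"}
B = {"b1", "b2", "a2"}
C = A & B  # intersection
D = A | B  # union

# C is contained in both A and B.
c_inside_both = C <= A and C <= B

# D contains both A and B.
d_contains_both = A <= D and B <= D

print(c_inside_both, d_contains_both)  # True True
```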
6. Review
Set theory is an excellent starting point for those who want a deeper understanding of probability and SQL programming.
In this article, we saw that the intersection of two or more sets consists of the elements that belong to all of them. We also saw that the union of two sets is a set that contains all of their elements.
Although we used two sets in this article, the same ideas can be extended to 3, 4, …, n different sets.
There’s a caveat here:
The number of elements in the union of two or more sets is not the same as the sum of the sizes of the individual sets, although intuitively that might seem to make sense. If we add the element counts of A and B, the elements that belong to their intersection are counted twice. For two sets, the correct count is |A ∪ B| = |A| + |B| − |A ∩ B|.
We need to remember this concept, especially when we’re talking about probability.
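The caveat can be made concrete with a small count in Python, again assuming hypothetical string labels for the elements of A and B:

```python
# Hypothetical labels: a2 and b2 belong to both sets.
A = {"a1", "a2", "b2"}
B = {"b1", "b2", "a2"}

naive_total = len(A) + len(B)  # counts a2 and b2 twice
overlap = len(A & B)           # size of the intersection
union_size = len(A | B)        # each element counted once

print(naive_total, overlap, union_size)  # 6 2 4

# Subtracting the overlap corrects the double counting.
print(union_size == naive_total - overlap)  # True
```

The same distinction shows up in SQL, where UNION removes duplicate rows and UNION ALL keeps them.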
References: