Setting up a database:
- Create a database
- Create a user
- Create a table
- Create a column
- Create indexes
- Add foreign keys
Using the database:
- INSERT
- SELECT
- UPDATE
- DELETE
Learn how to build and use a common MariaDB (or MySQL) database with examples for phpMyAdmin as well.
Setting up a database:
Using the database:
ALTER TABLE tbl_name ADD col_name data_type NOT NULL;
Where tbl_name
is the name of the table, col_name
is the name of the column and data_type
is the data type as described below.
If one wish to allow NULL
values, one has to remove the NOT
keyword. It is desirable to not use NULL
values if not necessary, as they take up storage space (where an empty values doesn't).
Storage Requirement (bytes) |
Integer | Real | Date | String 11 |
---|---|---|---|---|
1 | BOOLEAN |
DECIMAL (1, [0-1]) |
YEAR |
CHAR (1) 12 |
TINYINT |
DECIMAL (2, [even]) 1 |
VARCHAR ([1-255]) |
||
TINYTEXT |
||||
2 | SMALLINT |
"EXACT REAL(2)" 2 | CHAR (2) 12 |
|
DECIMAL (3, [0-3]) |
VARCHAR ([256-16 383]) 13 |
|||
DECIMAL (4, [even]) 3 |
TEXT |
|||
3 | MEDIUMINT |
"EXACT REAL(4)" 2 | DATE |
CHAR (3) 12 |
DECIMAL (5, [0-5]) 4 |
TIME |
MEDIUMTEXT |
||
DECIMAL (6, [even]) 5 |
||||
4 | INTEGER |
"EXACT REAL(6)" 2 | TIMESTAMP |
CHAR (4) 12 |
DECIMAL (7, [0-7]) 6 |
LONGTEXT |
|||
DECIMAL (8, [even]) 7 |
||||
DECIMAL (9, [0,9]) 8 |
JSON 14 |
|||
FLOAT |
||||
5 | "EXACT REAL(9)" 2 | CHAR (5) 12 |
||
DECIMAL (10, [0-2,4,6,8-10]) 9 |
||||
DECIMAL (11, [0,2,9,11]) |
||||
6 | "EXACT REAL(11)" 2 | CHAR (6) 12 |
||
DECIMAL (10, [3,5,7]) 9 |
||||
DECIMAL (11, [1,3-8,10]) |
||||
DECIMAL (12, [0-4,6,8-12]) |
||||
DECIMAL (13, [0,2,4,9,11,13]) |
||||
7 | "EXACT REAL(13)" 2 | CHAR (7) 12 |
||
DECIMAL (12, [5,7]) |
||||
DECIMAL (13, [1,3,5-8,10,12]) |
||||
DECIMAL (14, [0-6,8-14]) |
||||
DECIMAL (15, [0,2,4,6,9,11,13,15]) |
||||
8 | BIGINT |
"EXACT REAL(15)" 2 | DATETIME |
CHAR (8) 12 |
DECIMAL (14, 7) |
||||
DECIMAL (15, [1,3,5,7,8,10,12,14]) |
||||
DECIMAL (16, [0-16]) |
||||
DECIMAL (17, [0,2,4,6,8,9,11,13,15,17]) 10 |
||||
DECIMAL (18, [0,9,18]) 10 |
||||
DOUBLE |
||||
25 | POINT 15 |
|||
NOTES:
|
Unless there is a very specific need, you probably need a data type taking less than 5 bytes. Higher than 8 bytes, it is most likely too much and it is a good indicator that there is probably another solution better suited for your needs.
Always prefer an INTEGER
type for best performance.
If you need a DATETIME
(8 bytes), you may prefer a DATE
and TIME
separately (3 bytes each) and a separate VIRTUAL
column for DATETIME
(no storage), like so:
ALTER TABLE table_name ADD date_name DATE NOT NULL, ADD time_name TIME NOT NULL AFTER date_name, ADD datetime_name DATETIME AS (ADDTIME(date_name, time_name)) VIRTUAL AFTER time_name;
In addition to the storage gain, this may also help create better indexes for some queries based on time or date alone.
TIMESTAMP
is generally used to track record changes (INSERT
& UPDATE
). Prefer DATETIME
to store specific values (a birthday or a hiring date, for example).
If you don't need to search within a string, or use it for ordering or grouping, you might prefer using an integer referring to a list in a pre-compiled code instead of storing it into the database. With the same reasoning, you shouldn't use the database for internalization either; There are other methods to do this that most likely offer better advantages.
Prefer CHAR
when the string length is always the same and when you can use a character set with 1 byte per character (a serial number for example).
Prefer VARCHAR
for word-like strings that make sense as a whole (a username or a URL, for example) and TEXT data types for sentences that you might need to parse for specific words (a comment or an article for example). Technically, VARCHAR
takes more memory than TEXT data types when in use, but is faster because there is less overhead.
TEXT data types also make more sense if the strings are only stored and retrieved (i.e. they never appear in WHERE
, GROUP BY
or ORDER BY
clauses).
Try not to go over a length of 255 for VARCHAR
, such that only 1 byte is used for storing the string length (2 bytes otherwise). Anyway, there are very few «word-like» strings above 255 characters, it is thus a good indicator that a TEXT data type might be better suited.
ENUM
and SET
When you look at the MariaDB (or MySQL) documentation, you will note two additional string data types that were not specified in the above table, namely ENUM
and SET
. There is a good reason for it: Don't use them! Why? Because they are not string data types, they are INTEGERs, and even MariaDB treats them as such. You can see them as constant, as defined in other programming languages - for example, const int STRING1 = 1;
or final int string1 = 1;
. You wouldn't refer to «string1» anywhere other than within your code, because «string1» is a constant name, not data. If you need an INTEGER, use an INTEGER. The only thing ENUM
and SET
add is a correspondent string value to the integer, which is something you can do elsewhere, perhaps in another table, that may or may not be used with a JOIN
and linked with a FOREIGN KEY
. But if you define it outside the database, at the very least, adding a comment within the database is a good idea to keep track of things.
The way to use an INTEGER as an ENUM
is simply to make a one-to-one correspondence with each string (1 => "string1"; 2 => "string2"; ...). And for a SET
, it is only storing an array of INTEGERs (See below for arrays). MariaDB actually uses the bitwise technique for SET
.
Anyway, a program well designed would only care for a «state» (represented by the integer value) and would only seek for a «human readable» value in its last step, when presenting data. Manipulating the «human readable» value for the different operations is not a good idea and often pointless, especially when all humans may not use the same language (now or in the future).
Integers are great, but they have limited range and precision, and they don't store real values. There are basically 3 data types that can take care of this: FLOAT
, DOUBLE
and DECIMAL
. DECIMAL
(M, D) is declared with 2 numbers: a precision (M) - the quantity of significant digits in a number - and a decimal (D) - i.e. the quantity of digits after the decimal point. As one can see in the table above, choosing the wrong DECIMAL
can have a great impact on the storage requirement.
Let us show what precision represents. 123.45 has 5 significant digits, 12340000 has 4 significant digits and 0.0123456 has 6 significant digits. Basically, it means that the numbers could represent exact values like 123.4500258 or 12339021.243 or 0.0123456481, but the measuring instruments couldn't be more precise than 4, 5 or 6 digits.
It is easy to estimate the maximum error on a number with a given precision. The error is always ±0.5 on the last digit. Thus, for any value of x:
Precision | Minimum Number | Maximum Error | Maximum Precision Error |
---|---|---|---|
1 | 1e10x | 0.5e10x | 50 % |
2 | 1.0e10x | 0.05e10x | 5 % |
3 | 1.00e10x | 0.005e10x | 0.5 % |
4 | 1.000e10x | 0.0005e10x | 0.05 % |
5 | 1.0000e10x | 0.00005e10x | 0.005 % |
6 | 1.00000e10x | 0.000005e10x | 0.0005 % |
7 | 1.000000e10x | 0.0000005e10x | 0.00005 % |
10 | 1.000000000e10x | 0.00000000005e10x | 5e10-8 % |
15 | 1.000000000000000e10x | 0.0000000000000005e10x | 5e10-13 % |
18 | 1.000000000000000000e10x | 0.0000000000000000005e10x | 5e10-16 % |
n | 1e10x | 0.5e10(x-n) | 5e10(2-n) % |
Ordinary measuring instruments give a precision of about 7 digits (a clock that can loose 1 second every year, for example). The most precise measuring instruments made by mankind can go to about an 18-digit precision (an atomic clock that looses 1 second every 14 billion years, for example). Expecting needing a higher precision is foolish.
That being said, FLOAT
can provide 7-digit precision (4 bytes), DOUBLE
can provide 15-digit precision (8 bytes), and DECIMAL
can go as high as 65 digits (29-30 bytes). But as seen in the table above, with only 8 bytes of storage space, you can use 16-digit precision DECIMAL
(even, up to 18 with certain decimal values). Since the storage requirement of a DECIMAL
is closely related to its precision (M), anything above 18 or so is a waste of storage memory.
Also, with FLOAT
and DOUBLE
data types, you get either a 7 or 15 -digit precision, with 4 or 8 bytes storage requirement, with nothing in between. DECIMAL
can fill that void.
The other limit for a real number is the range.
FLOAT
can store values from -3.402823466E+38 to -1.175494351E-38, 0 and from 1.175494351E-38 to 3.402823466E+38.
DOUBLE
can store values from -1.7976931348623157E+308 to -2.2250738585072014E-308, 0 and from 2.2250738585072014E-308 to 1.7976931348623157E+308.
For DECIMAL
, the range is less wide and depends on the declared precision and decimal. For example, a DECIMAL
(7,3) would have a range between -9999.999 and 9999.999. You can get a range similar to FLOAT
by using DECIMAL
(38,0) for values greater than 1 or DECIMAL
(38,38) for values less than 1. To cover approximately the full range of FLOAT
, you need something like DECIMAL
(65,32).
FLOAT
and DOUBLE
seem to be the best choices because of the extended range with no more memory needed than with an equivalent DECIMAL
. However, they have a serious problem with accuracy. Even if you can count on them to have 7 or 15 digits of precision, after some calculations, they may show more random digits than can cause unexpected behavior. For example, multiplying 2 FLOAT
s with value of 0.1 together to find a value x, you will end up with 0.010000000298023226 as a result (8-digit precision). If you decide to use this result in (x > 0.01), you will get TRUE, even though the expected result is FALSE. DOUBLE
will offer a greater precision, but the result will still not be accurate.
On the other hand, DECIMAL
offers perfect accuracy, over its range and precision.
You can sometime save storage space by using integers instead of decimals. For example, say you want to store numbers that will range between 0.00000 and 40000.00000. The obvious choice seems to be DECIMAL
(10, 5), which requires 6 bytes of storage space. However, you can multiply your number by 100 000 and store it in an INTEGER UNSIGNED
instead, which requires only 4 bytes of memory. The drawback is a slower response as you need to do calculations every time you insert or retrieve a value. The most practical way is by using a VIRTUAL
column, which will not be stored:
ALTER TABLE table_name ADD integer_name INT UNSIGNED NOT NULL, ADD decimal_name DECIMAL(10,5) AS (integer_name / 100000) VIRTUAL AFTER integer_name;
So you will find the correct value in the VIRTUAL
column.
Similarly, you can store an «EXACT REAL» by storing the significand and the exponent of a floating number, and use a VIRTUAL
column as DECIMAL
. To store something similar to a FLOAT
, one could do the following:
ALTER TABLE table_name ADD significand_name DECIMAL(7,0) NOT NULL, ADD exponent_name TINYINT NOT NULL AFTER significand_name, ADD exact_real_name DECIMAL(65,32) AS ( -- Convert the base and the exponent to DECIMAL IF(exponent_name >= 0, -- Add zeros as necessary to string '1' and convert to DECIMAL CAST(CONCAT('1',REPEAT('0',exponent_name)) AS DECIMAL(65,0)), -- Add zeros as necessary to string decimal and convert to DECIMAL CAST(CONCAT('0.',REPEAT('0',ABS(exponent_name) - 1),'1') as DECIMAL(38,38)) ) * -- Multiply by the significand (converted to DECIMAL) CAST(significand_name as DECIMAL(7,0)) ) VIRTUAL AFTER exponent_name;
This will give a range of -9.999 999e32 to -1e-32, 0 and 1e-32 to 9.999 999e32, with only 5 bytes of storage space. But, as oppose to FLOAT
, if you multiply 2 of these EXACT REALs with value of 0.1 together, you will end up with 0.01000000000000000000000000000000000000 instead of 0.010000000298023226.
Do not use the built-in function POW() to do the transformation as it will return an inaccurate double precision number. The values used in the transformation must be DECIMALs such that the result is also a DECIMAL.
Here's what a function in PHP might look like to get the significand and the exponent from a number to fit in the columns of our example:
$number = 123456789; $exponent = floor(log10(abs($number))) - 6; $significand = $number / pow(10, $exponent); // $significand = 1234567.89 // $exponent = 2; // $significand will be automatically rounded // to nearest DECIMAL(7,0) by MariaDB upon INSERT or UPDATE
Storing latitude and longitude coordinates can easily be done with 7-digit precision INTEGERs or DECIMAL
data types.
Unfortunately, these data types can only be used with 1-D indexes, namely B-tree data structure. You are much better off with SPATIAL INDEX
, which uses a R-tree data structure, specifically designed for sorting multi-dimensional data. The data type needed for this is POINT
which stores both variables with double precision (and some other information), which leads to a 25 bytes storage requirement. That is the trade-off you have to make to be able to use the right tools.
There are no «array» types in MariaDB. Here are the main methods to represent arrays.
The most versatile way is to store the values in another table instead of a column. For example, say you have a table person like this one:
person_id | first_name | last_name | date_of_birth |
---|---|---|---|
22 | John | Smith | 1970-jan-01 |
36 | Jane | Doe | 1977-sep-12 |
Now, you would like to add a column pet to represents the pets owned by each person. Because each person can have more than one pet, it is preferable to create a table pet instead and a reference table person_pet linking both tables together:
pet_id | name | race |
---|---|---|
1 | Mittens | cat |
2 | Fido | dog |
3 | Buster | dog |
person_id | pet_id |
---|---|
22 | 1 |
22 | 3 |
36 | 2 |
By appropriately joining the tables (for example, person JOIN person_pet USING(person_id) JOIN pet USING(pet_id)
),you can get all pets with their owner's information.
If pet would not be used anywhere else than with table person, than all the columns found in table pet can be put into person_pet (except pet_id, which is not required anymore), to eliminate one table and one JOIN
operation.
It is good practice to name the reference table with the «main» table name as the prefix. person_pet means «pet is an array "column" for table person». That way, when listed alphabetically, all reference tables will be following their respective «main» table (in a program like phpMyAdmin or the like, for example). It is a lot easier to follow, especially when you have many reference tables linked to a single «main» table.
It is also good practice to name the ID column with the table name as the prefix, as shown above. This makes things a lot easier to follow when using the ID in more than one table. This also let us use the form table_1 JOIN table_2 USING(table_1_id)
where MariaDB knows that table_1_id found in both tables refers to the same value and only returns one copy, if asked to. You can even use the simpler form table_1 NATURAL JOIN table_2
, where MariaDB looks automatically for the column names (there can be more than one, so be careful) shared by both tables.
If you have multiple choices between a given set of values (equivalent to checkboxes within a form), you can use this technique that only uses a single INTEGER column, thus very efficient speed- and memory-wise. This is the method used with the SET
data type.
Logically link an integer with each possible value, say a string, but only with numbers containing only one bit set to "1" in binary form:
Integer | Binary | Value |
---|---|---|
1 ( = 20 ) | 00000001 | "string1" |
2 ( = 21 ) | 00000010 | "string2" |
4 ( = 22 ) | 00000100 | "string3" |
8 ( = 23 ) | 00001000 | "string4" |
... | ... | ... |
That way, the integer 00000101 (= 5) actually means "string1" and "string3" are selected. Of course, with BIGINT
- with its maximum value of (264 - 1) - there are only 64 possible choices.
You can then use bit functions to look for desired values.
There are JSON functions available in MariaDB to manipulate JSON objects. So if you store a valid JSON object into any string data type (JSON
data type exists for compatibility with MySQL, but it really is a LONGTEXT
), you can use those functions to manipulate the data.
But remember that it might not extend as easily as reference tables do, if you wish to link data stored in JSON objects to other tables in the future.