SQLizer will analyse the length of all the text values in your source file and size the database columns accordingly. This pages explains this process in more detail.
Consider the following CSV, the Values column has text values with various lengths in it.
ID,Values
1,foo
2,The quick brown fox jumped over the lazy dog
3,bar
SQLizer will size the 'Values' column with a width of 44 on MySQL and SQL Server. On PostgreSQL, the column type of TEXT will be used.
CREATE TABLE IF NOT EXISTS text (
`ID` INT,
`Values` VARCHAR(44) CHARACTER SET utf8
);
INSERT INTO text VALUES
(1,'foo'),
(2,'The quick brown fox jumped over the lazy dog'),
(3,'bar');
IF NOT EXISTS (
select * from sysobjects where name='text' and xtype='U'
) CREATE TABLE text (
[ID] INT,
[Values] NVARCHAR(44)
);
INSERT INTO text VALUES
(1,N'foo'),
(2,N'The quick brown fox jumped over the lazy dog'),
(3,N'bar');
CREATE TABLE IF NOT EXISTS "text" (
"ID" INT,
"Values" TEXT
);
INSERT INTO "text" VALUES
(1,'foo'),
(2,'The quick brown fox jumped over the lazy dog'),
(3,'bar');
CREATE TABLE IF NOT EXISTS text (
ID INTEGER,
Values TEXT
);
INSERT INTO text VALUES
(1,'foo'),
(2,'The quick brown fox jumped over the lazy dog'),
(3,'bar');
On MySQL, the VARCHAR
datatype is the most useful column type for storing text, but its length
counts towards a maximum row size of 65,535 bytes. To avoid producing a SQL script that will fail, SQLizer
will estimate the size of the rows in your table and convert text columns to TEXT
,
MEDIUMTEXT
or LONGTEXT
data types in order to ensure the row will fit within the
size limit (starting with the longest columns first). For exmaple, consider the following CSV:
ID,ShortText,LongText
1,foo,abc
2,bar,Lorem ipsum... 65535 bytes of text ...end
3,baz,xyz
The overall length of the row will cause the longest column ('LongText' in this instance) to be converted
into a MEDIUMTEXT
type:
CREATE TABLE IF NOT EXISTS long_text (
`ID` INT,
`ShortText` VARCHAR(3) CHARACTER SET utf8,
`LongText` MEDIUMTEXT CHARACTER SET utf8
);
INSERT INTO text VALUES
(1,'foo','abc'),
(2,'bar','Lorem ipsum... 65535 bytes of text ...end'),
(3,'baz','xyz');
For columns under 64000 bytes in length the TEXT
type will be used. For columns longer than
16000000 bytes, the LONGTEXT
type will be used.
On SQL Server databases, the NVARCHAR
datatype is used. If the maximum length of all text
values is under 4000 characters, the column is sized using that limit. However, if the maximum length
of all text values is over 4000 characters, the NVARCHAR(MAX)
data type is used.