SQLizer Logo

Easily convert files into SQL databases

Text Handling in SQLizer

SQLizer will analyse the length of all the text values in your source file and size the database columns accordingly. This pages explains this process in more detail.

Consider the following CSV, the Values column has text values with various lengths in it.

ID,Values
1,foo
2,The quick brown fox jumped over the lazy dog
3,bar

SQLizer will size the 'Values' column with a width of 44 on MySQL and SQL Server. On PostgreSQL, the column type of TEXT will be used.

CREATE TABLE IF NOT EXISTS text (
    `ID` INT,
    `Values` VARCHAR(44) CHARACTER SET utf8
);
INSERT INTO text VALUES
    (1,'foo'),
    (2,'The quick brown fox jumped over the lazy dog'),
    (3,'bar');
IF NOT EXISTS (
    select * from sysobjects where name='text' and xtype='U'
) CREATE TABLE text (
    [ID] INT,
    [Values] NVARCHAR(44)
);
INSERT INTO text VALUES
    (1,N'foo'),
    (2,N'The quick brown fox jumped over the lazy dog'),
    (3,N'bar');
CREATE TABLE IF NOT EXISTS "text" (
    "ID" INT,
    "Values" TEXT
);
INSERT INTO "text" VALUES
    (1,'foo'),
    (2,'The quick brown fox jumped over the lazy dog'),
    (3,'bar');
CREATE TABLE IF NOT EXISTS text (
    ID INTEGER,
    Values TEXT
);
INSERT INTO text VALUES
    (1,'foo'),
    (2,'The quick brown fox jumped over the lazy dog'),
    (3,'bar');

TEXT Columns in MySQL

On MySQL, the VARCHAR datatype is the most useful column type for storing text, but its length counts towards a maximum row size of 65,535 bytes. To avoid producing a SQL script that will fail, SQLizer will estimate the size of the rows in your table and convert text columns to TEXT, MEDIUMTEXT or LONGTEXT data types in order to ensure the row will fit within the size limit (starting with the longest columns first). For exmaple, consider the following CSV:

ID,ShortText,LongText
1,foo,abc
2,bar,Lorem ipsum... 65535 bytes of text ...end
3,baz,xyz

The overall length of the row will cause the longest column ('LongText' in this instance) to be converted into a MEDIUMTEXT type:

CREATE TABLE IF NOT EXISTS long_text (
    `ID` INT,
    `ShortText` VARCHAR(3) CHARACTER SET utf8,
    `LongText` MEDIUMTEXT CHARACTER SET utf8
);
INSERT INTO text VALUES
    (1,'foo','abc'),
    (2,'bar','Lorem ipsum... 65535 bytes of text ...end'),
    (3,'baz','xyz');

For columns under 64000 bytes in length the TEXT type will be used. For columns longer than 16000000 bytes, the LONGTEXT type will be used.

SQL Server NVARCHAR Length

On SQL Server databases, the NVARCHAR datatype is used. If the maximum length of all text values is under 4000 characters, the column is sized using that limit. However, if the maximum length of all text values is over 4000 characters, the NVARCHAR(MAX) data type is used.