First tests with SQL Server vNext on Linux and Windows
The IT world is changing. The major players are changing. The old rules are evolving and the old barriers between “gardens” are disappearing. Microsoft is changing, but it is not alone. The first preview of Microsoft SQL Server vNext for Linux debuted in November 2016. On the same day, Microsoft became a platinum member of the Linux Foundation. And on the same day Google joined the .NET Foundation Technical Steering Group (Google announcement here).
SQL server vNext on Linux and Windows
I am too curious, so I immediately created a couple of VM on Azure to try the new SQL Server on Linux, Ubuntu 16.04, and on Windows Server 2016. The VMs are D2_V2 (2 core, 7 GB ram). Both virtual machines run on a standard storage account, i.e. traditional hard disks.
The setup on Linux was straightforward and fast. Starting from a running Ubuntu 16.04 and following the documentation, in less than 10 minutes I have got a running Sql Server. The installation is done through APT. Before starting, it is necessary to update the package repositories list and the a simple command installs SQL Server:
sudo apt-get install -y mssql-server
After that, another command allows choosing the SA password and setup the start-up mode of the sql service:
sudo /opt/mssql/bin/sqlservr-setup
That’s all. SQL Server is running on Linux! But bear in mind that SQL Server for Linux lacks a lot of features compared to the full version for Windows. Details here. You should also install the tools sqlcmd and bcp for Linux.
On Windows Server 2016 the installation was much longer: the iso or cab needed to be downloaded first. The well-known (old?) graphical setup wizard has many options and executes a lot of checks during the setup. It required more than 40 minutes to complete.
Connecting from a remote machine
I managed both the SQL servers directly from my PC through the internet using the SQL Management Studio (it is recommended to upgrade to the latest version. Previous version sometime hangs managing Linux machines).
To connect to the remote machines, I needed to add a security rule to allow inbound TCP traffic on port 1433 on both virtual machines. Important note: you should restrict the allowed source to your public IP. It is not a good idea to expose a sql tcp port to the full internet.
On Windows Server 2016 it is also required a similar rule on the internal firewall. On the other hand, Linux is already open to accept connections (perhaps too much open, in my opinion).
Using the latest SQL Server Management Studio, I created an empty database on both the VMs. A funny behavior: the paths from the Linux machine are always prefixed with a “C:”, so the resulting path is something like: C:\var\opt\mssql\data\mytestdb.mdf
Restore a database
I downloaded the sample AdventureWork2014 backup from Microsoft site and restored it on both VMs. The command was the same:
RESTORE DATABASE [AdventureWorks2014]
FROM DISK = N'C:\Fabry\sql_backup\AdventureWorks2014.bak'
WITH FILE = 1,
MOVE N'AdventureWorks2014_Data'
TO N'C:\Fabry\mssql_data\AdventureWorks2014_Data.mdf',
MOVE N'AdventureWorks2014_Log'
TO N'C:\Fabry\mssql_data\AdventureWorks2014_Log.ldf',
NOUNLOAD, STATS = 5
RESTORE DATABASE [AdventureWorks2014]
FROM DISK = N'C:\home/Fabry/db_backup\AdventureWorks2014.bak'
WITH FILE = 1,
MOVE N'AdventureWorks2014_Data'
TO N'C:\var\opt\mssql\data\AdventureWorks2014_Data.mdf',
MOVE N'AdventureWorks2014_Log'
TO N'C:\var\opt\mssql\data\AdventureWorks2014_Log.ldf',
NOUNLOAD, STATS = 5
Some performance tests
AdventureWorks2014 database is not so big for a D2_V2 virtual machine. So I decided to “fatten” it using the script from sqlskills.com. It creates two new big tables, Sales.SalesOrderHeaderEnlarged and Sales.SalesOrderDetailEnlarged.
The first run of the script was done without changing the predefined auto-growth of the database files. The resulting mdf was 1.5 GB and the LDF was 5 GB. A lot of time was consumed by allocating new disk space. This is the normal behavior of sql server and both the versions behaved the same way. I did a second run, after removing the two new tables. As expected it was faster.
First run | Second run | |
---|---|---|
Windows | 00:10:19 | 00:03:44 |
Linux | 00:05.06 | 00:02:31 |
Linux was faster than Windows on both cases. It is surprising. I expected SQL server on Linux slower than Windows. I repeated the tests and the result are the same.
I did other tests running different queries and Linux was always faster. I repeated the test after couple of days. The VMs were shutdown, deallocated and restarted. The results were more or less the same. I am still investigating about these strange results.
Windows | Windows | Linux | Linux | |
---|---|---|---|---|
Query | Cpu time | Elapsed time | Cpu time | Elapsed time |
(0) | 57000 | 34000 | ||
(1) | 4329 | 2667 | 2919 | 6037 |
(2) | 4390 | 2287 | 2344 | 1316 |
(3) | 56092 | 38323 | 31683 | 21553 |
(4) | 3063 | 1877 | 1713 | 862 |
(milliseconds)
Query_0
SELECT count_big(1) FROM
HumanResources.Department as T1, HumanResources.Department as T2,
HumanResources.Department as T3, HumanResources.Department as T4,
HumanResources.Department as T5, HumanResources.Department as T6,
HumanResources.Department as T7, HumanResources.Department as T8
Query_1
CHECKPOINT
DBCC DROPCLEANBUFFERS
SELECT distinct
sales.SalesOrderHeaderEnlarged.salesorderid,
Production.Product.Name
from Sales.SalesOrderHeaderEnlarged
inner join Sales.SalesOrderDetailEnlarged
on Sales.SalesOrderHeaderEnlarged.SalesOrderID =
Sales.SalesOrderDetailEnlarged.SalesOrderID
inner join Production.Product
on Production.Product.ProductID =
Sales.SalesOrderDetailEnlarged.ProductID
where linetotal > 20000
Query_2
...like Query_1 but without CHECKPOINT and DBCC DROPCLEANBUFFERS
Query_3
SELECT count(1)
FROM [Sales].[SalesOrderDetail] as SOD1,
[Sales].[SalesOrderDetail] as SOD2
WHERE SOD1.ProductID=SOD2.ProductID
AND SOD1.UnitPrice > SOD2.UnitPrice
Query_4
SELECT distinct [CarrierTrackingNumber]
FROM [Sales].[SalesOrderDetailEnlarged]
[UPDATE 2016-12-04]
Following the notes from Marco Russo, I started a larger session of tests using different machine size on Azure. The results are very contradictory and inconsistent, in particular using virtual machine of the DS series. Different runs in different time in the day, showed different results. I can only guess about the reasons (CPU overcommitment?). On the other hand, DSv2 series showed a more stable behavior. The table below shows the results.
Windows | Linux | Linux VS Windows | |||||
---|---|---|---|---|---|---|---|
VM size | query | cpu time | elapsed time | cpu time | elapsed time | cpu time | elapsed time |
D2_V2 | 0 | 61594 | 36302 | 60037 | 34298 | -2.53% | -5.52% |
D2_V2 | 1 | 2687 | 4475 | 2448 | 3126 | -8.89% | -30.15% |
D2_V2 | 2 | 2891 | 1827 | 2304 | 1266 | -20.30% | -30.71% |
D2_V2 | 3 | 33531 | 23454 | 30760 | 20974 | -8.26% | -10.57% |
D2_V2 | 4 | 1984 | 1008 | 1649 | 830 | -16.89% | -17.66% |
D11_V2 | 0 | 63078 | 37782 | 56302 | 32051 | -10.74% | -15.17% |
D11_V2 | 1 | 3296 | 4276 | 2532 | 3915 | -23.18% | -8.44% |
D11_V2 | 2 | 2796 | 2135 | 2110 | 1536 | -24.54% | -28.06% |
D11_V2 | 3 | 35234 | 24252 | 29466 | 20013 | -16.37% | -17.48% |
D11_V2 | 4 | 1953 | 982 | 1507 | 759 | -22.84% | -22.71% |
DS2_V2 | 0 | 57593 | 32205 | 58225 | 33492 | 1.10% | 4.00% |
DS2_V2 | 1 | 1954 | 6339 | 2889 | 6016 | 47.85% | -5.10% |
DS2_V2 | 2 | 2359 | 1368 | 2343 | 1340 | -0.68% | -2.05% |
DS2_V2 | 3 | 30844 | 21008 | 30984 | 20937 | 0.45% | -0.34% |
DS2_V2 | 4 | 1687 | 844 | 1578 | 795 | -6.46% | -5.81% |
DS11_V2 | 0 | 57047 | 31664 | 60381 | 34344 | 5.84% | 8.46% |
DS11_V2 | 1 | 2094 | 6325 | 2818 | 6048 | 34.57% | -4.38% |
DS11_V2 | 2 | 2296 | 1274 | 2271 | 1242 | -1.09% | -2.51% |
DS11_V2 | 3 | 30078 | 20419 | 31596 | 21454 | 5.05% | 5.07% |
DS11_V2 | 4 | 1672 | 851 | 1614 | 816 | -3.47% | -4.11% |