2

Data Analysis and Manipulation

In this chapter we are going to delve into list operations, essential for data manipulation and for understanding how Mathematica works; show how to import and export data and files in different formats; have our first contact with the statistical analysis of data; and see how to connect to big databases using ODBC.

2.1 Lists

The basic structure in Mathematica is the list. The elements of a list can be numbers, variables, functions, strings, other lists or any combination thereof; they are written inside braces {} and are separated by commas. If we feel comfortable with the classical terminology of mathematics we can interpret a simple list as a vector, a two-dimensional list (a list of sublists) as a matrix, and a list of three or more dimensions as a tensor, keeping in mind that the tensor elements can be sublists of any dimension.

It’s important to distinguish operations between lists and note that they don’t behave in the same way as operations between matrices, especially with respect to multiplication.

To extract a list element based on its position i in the list we use Part[list, i] or, in compact form [[i]].

In the case of a list of sublists, first we identify the sublist and then the element within the sublist.

The solutions (out) are often displayed in list format. If we need a specific part of the solution, we can use the previous command to extract it.

Other functions very useful for list manipulation are Flatten, Take and Drop.

Some functions, when applied to lists, operate on the individual list elements, e.g. f[{a, b, ...] → {f[a], f[b], ...}. In this case the function is said to be Listable.

There are many more functions used to manipulate lists. We recommend you consult the documentation: guide/ListManipulation.

The most recent versions of Mathematica, starting with Mathematica 10, include datasets (Dataset). These constructions can be considered an extension to the concept of lists. With this new addition, the program has a very powerful way to deal with structured data.

2.2 Importing/Exporting

A fundamental feature of Mathematica is its capability for importing and exporting files in different formats. The commands we need for that are Import and Export. They enable us to import or export many different types: XLS, CSV, TSV, DBF, MDB, ...

In this section we are going to use, among others, the files available as examples in a subdirectory named ExampleData created during the program installation. Some of these examples are downloaded from www.wolfram.com (Internet connection required).

A file can be imported or exported by typing the complete source or destination path.

Import["C:\\Users\\guill\\Google Drive\\Mathbeyond\\Data\\file.xls"]

If we want to use the working directory in which the files are located for importing or exporting purposes, normally the easiest way will be to create a subdirectory inside the directory where the notebook is.

In the directory above, create a subdirectory named Data.

We make the previous cell an Initialization Cell to ensure its evaluation before any other cell in the notebook by clicking on the cell marker and choosing from the menu bar: Cell ▶Properties ▶Initialization Cell). This means that all the data will be imported/exported from the Data subdirectory. We may even consider “hiding” the initialization cell by moving it to the end of the notebook. Remember that the order of operations is not the same as it appears on screen. It is the order in which the cells are evaluated. Therefore, even if an initialization cell is located at the end it will still be the first one to be executed.

2.2.1 Importing

If your computer uses a non-American configuration remember that Mathematica uses the dot “.” as a decimal separator and the comma “,” to distinguish between list elements. If you have another configuration you may have problems when importing files. The program enables the configuration of the format in Edit ▶ Preferences... ▶ Appearance to suit the user needs. Nevertheless, it is better to modify the operating system global settings so that all programs use the same setup. In Windows that can be done by selecting in the Control Panel “.” as decimal separator and the comma “,” as list separator. In OS X, the same can be done by going to System Preferences... ▶ Language & RegionAdvanced...

  • Next, we import an xls file containing data about the atomic elements. Note that the program can figure out the type of file from its extension. In this case it’s an xls file, and the program identifies it as a Microsoft® Excel file.

    Import ["ExampleData/elements.xls"]

    {{{AtomicNumber, Abbreviation, Name, AtomicWeight},
      {1., H, Hydrogen, 1.00793}, {2., He, Helium, 4.00259},
      {3., Li, Lithium, 6.94141}, {4., Be, Beryllium, 9.01218},
      {5., B, Boron, 10.8086}, {6., C, Carbon, 12.0107},
      {7., N, Nitrogen, 14.0067}, {8., O, Oxygen, 15.9961},
      {9., F, Fluorine, 18.9984}}}

  • Notice that the previous output has the following structure {{{...}}}. The reason is that Mathematica assumes the file is an Excel spreadsheet that may contain several worksheets. Using {“Data”, 1} we tell the program to import the data from worksheet 1 (do not confuse this Data with the working subdirectory Data that we have created earlier):

    examplexls=Import["ExampleData/elements.xls", {"Data", 1}]

    {{AtomicNumber, Abbreviation, Name, AtomicWeight},
     {1., H, Hydrogen, 1.00793},
     {2., He, Helium, 4.00259}, {3., Li, Lithium, 6.94141},
     {4., Be, Beryllium, 9.01218}, {5., B, Boron, 10.8086},
     {6., C, Carbon, 12.0107}, {7., N, Nitrogen, 14.0067},
     {8., O, Oxygen, 15.9961}, {9., F, Fluorine, 18.9984}}

  • We show below two ways to extract the first 5 rows of the worksheet and display them in a table format.

    TableForm[Take[examplexls, 5]]

    AtomicNumber  Abbreviation  Name        AtomicWeight
    1.            H             Hydrogen    1.00793
    2.            He            Helium      4.00259
    3.            Li            Lithium     6.94141
    4.            Be            Beryllium   9.01218

    TableForm[examplexls [[;;5]]]

    AtomicNumber  Abbreviation  Name        AtomicWeight
    1.            H             Hydrogen    1.00793
    2.            He            Helium      4.00259
    3.            Li            Lithium     6.94141
    4.            Be            Beryllium   9.01218

  • Here we see how to import a single column. In this example, we get the fourth column containing the atomic weights.

    Import["ExampleData/elements.xls", {"Data", 1, All, 4}]

    {AtomicWeight, 1.00793, 4.00259, 6.94141,
     9.01218, 10.8086, 12.0107, 14.0067, 15.9961, 18.9984}

  • We use Rest to remove the first sublist containing the headings (the same could have been done using Drop [data, 1 ]).

    data = Rest[Import["ExampleData/elements.xls", {"Data", 1}]];

  • We show the imported data. Tooltip is included so that when the cursor is over a point in the graphic, its corresponding atomic weight can be seen.

    ListPlot[Tooltip[data[[All, 4]]],
     Filling → Axis, PlotLabel → "Atomic Weight"]

    Images

  • In our next example we find out the elements available in general for this type of format. It refers to all the possible elements. It doesn’t mean that this file contains all of them.

    Import["ExampleData/elements.xls", "Elements"]

    {Data, FormattedData, Formulas, Images, Sheets}

  • We choose “Sheets” that returns the number of sheets that the file contains including the labels for each one. In this case it only contains one.

    Import["ExampleData/elements.xls", "Sheets"]

    {Spreadsheet1}

  • To display data in a typical spreadsheet format, like Excel, we can use TableView, a builtin function that as of Mathematica 11.0 remains undocumented.

    TableView[Import["ExampleData/elements.xls", {"Data", 1}]]

    Images

Note: Undocumented functions in Mathematica are not guaranteed to be included in future releases of the program. Additionally, some users might experience software stability issues when evaluating them. For these reasons, we recommend caution when using them.

  • We can use DateListPlot, a very useful function when the values in the x-axis are dates.

    DateListPlot[
     Import["ExampleData/financialtimeseries.csv"], Filling → Axis,
     PlotLabel → "Financial Data Time Series", ImageSize → Medium]

    Images

  • Try to copy and paste links to any of the numerous images available in: earthobservatory.nasa.gov/Features/BlueMarble/.

    Import[
     "http://earthobservatory.nasa.gov/Features/BlueMarble/Images/BlueMarble_
       2005_SAm_09_1024.jpg"]

    Images

  • This function returns the size of an image.

    Import[
     "http://earthobservatory.nasa.gov/Features/BlueMarble/Images/BlueMarble_
       2005_SAm_09_1024.jpg", "ImageSize"]

    {1024, 1024}

    Note: The examples using files imported from the Internet may not work if their web addresses have been changed or the files themselves deleted.

Importing from databases is a frequent data import operation. For example: importing from an Access mdb file.

  • We import the complete structure of the database:

    Import[ "ExampleData/buildings.mdb", {"Datasets", "Buildings", "Labels"}]

    {Rank, Name, City, Country, Year, Stories, Height}

  • We use Short or Shallow to display just a few data elements after importing the entire database (this is practical if there’s only one table and it’s not very big).

    Import[ "ExampleData/buildings.mdb" ] // Short

    {{{1, Taipei 101, Taipei, Taiwan, 2004, 101, 508}, ≪19≫}}

  • We import the records associated to several fields and display them in a table format:

    Text[Style[Grid[Transpose[Import[
         "ExampleData/buildings.mdb", {"Datasets", "Buildings", "LabeledData",
          {"Rank", "Name", "City", "Height"}} ]], Frame → All], Small]]

    Images

For big databases (ORACLE, SQLServer, etc.), with a multitude of tables and a large number of records, it’s better to access them with an ODBC connection as we will show later.

From the URL address of a website we can limit the information selected to only what we are interested in. A good practice is to first open the URL in a browser and see what information suits our needs. After importing the web page you will notice that it will appear in Mathematica as a list of lists. Analyze how the information is structured. You need to find out what lists contain the information you want and then use [[...]] to extract it.

  • The following command opens the specified URL using the default browser on your system.

    SystemOpen["http://www.uxc.com/review/UxCPrices.aspx"]

  • From the previous web page (http://www.uxc.com/review/UxCPrices.aspx) we only import the information included in the frame: Ux Month-End Spot Prices as of MM, YYYY (Month-end spot price data), in US dollars and euros. The change from the previous month is included between parentheses. The following command was created after trial and error by downloading the entire web page and trying different combinations until getting the desired results. We tell Mathematica to show the results using small fonts with the command Style.

    Style[
     TableForm[Import["http://www.uxc.com/review/UxCPrices.aspx", "Data"][[
       2, 2]]], Small]

    Ux Month-End Prices as of November 28, 2016 [Change from previous month]
    1 US$=                              0.94226 €‡
    U 3 O 8 Price (lb)                  $18.25        ( -0.50 )
    NA Conv. (kgU)                      $5.85         ( Unch. )
    EU Conv. (kgU)                      $6.40         ( Unch. )
    NA UF 6 Price (kgU)                 $53.40        ( -1.45 )
    NA UF 6 Value § (kgU)               $53.53        ( -1.31 )
    EU UF 6 Value § (kgU)               $54.08        ( -1.31 )
    SWU (SWU)                           $47.00        ( -2.00 )

    The previous command will work as long as the web page keeps the same structure as when we accessed it (December 7, 2016).

If you know HTML then you might be interested in accessing the contents of a web page by using the function: URLFetch.

You can also import data through WolframAlfa using the free-form input.

  • In this example we analyze the evolution of the temperatures in Mexico City.

    Images

    Images

  • We can expand the output by clicking on the Images that appears in the upper right-hand corner. It will show us several sections available for selection. Here we have chosen the plot with the historical evolution of the average monthly temperatures by selecting from the contextual menu: Paste input for ▶ Subpod Content.

    WolframAlpha ["Temperatures in Mexico City since 1980 to today",
     {{"TemperatureChart:WeatherData", 1}, "Content"}]

    Images

To access multiple data from several fields is better to use the computable data functions that we will discuss in Chapter 5. These functions will always return the same output structure, while WolframAlpha outputs to the same question do not always return the data in the same format.

  • In this example we download and display the average temperature in a weather station (LESA) located in the province of Salamanca in Spain during a specific time period. We use the computable data function WeatherData combined with DateListPlot.

    DateListPlot[WeatherData["LESA", "MeanTemperature",
      {{2014, 8, 1}, {2016, 7, 31}, "Day"}], Joined → True]

    Images

2.2.2 Semantic Import

Since Mathematica 10 there’s a new function named SemanticImport that detects and interprets the type of objects imported.

  • We import sales data for different cities and dates.

    sales = SemanticImport["ExampleData/RetailSales.tsv"]

    Images

  • If you move the cursor over the table above, you will see that the program identifies whether the data points are dates or cities, with the later ones displaying additional information. In Mathematica, objects with these properties are called entities (Entity) and have certain characteristics that we will explore later on. For example: If you hover the mouse over Boston, the following message will appear on the screen: Entity[“City”, “Boston”, “Massachusetts”, “UnitedStates”], that is, the data point refers to the city located in Massachusetts (USA) and not to other less famous Boston cities that might exist. To find the first city in the table type the following:

    Images

  • Since it’s a city you can directly obtain information about it, e.g. its population (Remember that % refers to the last output).

    %["Population"]

    655 884 people

  • In this example we import rows 3 to 5:

    SemanticImport["ExampleData/RetailSales.tsv", Automatic, "Rows"][[3 ;; 5]]

    Images

2.2.3 Export

The variety of export formats is quite extensive: $ExportFormats. We can even export in compression formats such as zip.

As it is the case when importing, if we don’t specify the export path, the files will be copied to our working directory (Data in this case).

  • Generate a 3D graph:

    gr = BarChart3D[{RandomInteger[7, {4}], RandomInteger[{-7, -1}, {4}]},
       ChartElementFunction → "Cone"]

    Images

  • Export the graph as a file named “cones.tif” in tiff format.

    Export["cones.tif", gr, "TIFF"];

    You can then import the image previously exported with: Import[“cones.tif”].

We are going to show how to export an animation.

  • First, we create an animation using Manipulate. In later chapters we will see this interesting function in more detail.

    example3DAnimation = Manipulate[
    Plot3D[{Cos[k x y], Sin[k x y]}, {x, 0, Pi},
       {y, 0, Pi}, Filling -> Automatic, PlotRange -> 1], {k, 0, 2}]

    Images

  • Then, we export it to a video format, in this case avi. (Normally this operation may last a few minutes).

    Export["example3DAnimation.avi", example3DAnimation]

    example3DAnimation.avi

  • Now we can either manually locate the exported file and open it or call it directly from Mathematica and have the system open it using the default application for avi files.

    SystemOpen["example3DAnimation.avi"]

2.3 Descriptive Statistics

When working with big volumes of data, it’s very common to perform a preliminary exploratory data analysis. In this section we are going to see how to do it in Mathematica: tutorial/BasicStatistics.

Strictly speaking, we should be talking about pseudorandom numbers since they are generated by algorithms. However, for most practical applications they can be considered in all respects random numbers.

We can also create customized charts as shown next.

2.4 Application: Analysis of the Evolution of Two Cell Populations

Let’s analyze the evolution over time of two cell populations.

The next example, showing how to customize a graph, includes options that we haven’t covered yet (we will explain them later in the book). As usual, if any of the instructions is not clear, consult the help by selecting the command and pressing F1. You can use this code as a template for similar cases. Alternatively, you might find it easier to create the labels and plot legends with the drawing tools Images.

2.5 Application: Global Energy Consumption Analysis

We are going to use what we have learned so far to analyze the evolution of energy consumption globally.

We are going to conduct several studies using this xlsx file. In principle we could directly modify the data using Microsoft® Excel and create a new workbook with only the required information for our analyses. This method would probably be the most suitable one but here, for educational purposes, we will perform all the required transformations in Mathematica. This will help us to have a better understanding of the file manipulation capabilities of the program, although it will require a careful attention to detail.

The file contains many sheets. Before importing it, open the spreadsheet in Excel or any other similar program and examine its contents. Note that each worksheet has a label describing its contents. This is normally very useful.

With the help of the following functions we are going to import part of the file.

2.5.1 Global Primary Energy Consumption by Source

We analyze the global consumption of primary energy (including all types: electricity, transportation, etc.) in 2015, the last available year in the file, from different sources (oil, gas, hydropower, nuclear, etc.)

  • From the sheet labels we conclude that the information we want is in the sheet “Primary Energy - Cons by fuel”.

    primaryenergyDatabyFuel =
      Import["bp-statistical-review-of-world-energy-2016-workbook.xlsx",
       {"Sheets", "Primary Energy - Cons by fuel"}];

  • We use TableView to see the imported data. You can either increase the number of visible rows with ImageSize or explore the data like in a spreadsheet, using the slide bars on the right and at the bottom to move around the sheet and clicking on specific cells.

    TableView[primaryenergyDatabyFuel]

    Images

  • Notice that the information we want is in row 3 (energy types) and row 89 (the total global consumption, in Million-ton equivalent of petroleum (MTEP) or Million tons of oil equivalent (MTOE)).

    {energytype1, mtoe}=
     {primaryenergyDatabyFuel[[3]], primaryenergyDatabyFuel[[89]]}

    {{Million tonnes oil equivalent, Oil, Natural Gas, Coal,
      Nuclear Energy, Hydro electric, Renew- ables, Total, Oil, Natural Gas,
      Coal, Nuclear Energy, Hydro electric, Renew - ables, Total},
     {Total World, 4251.59, 3081.46, 3911.18, 575.47, 884.288, 316.605,
      13 020.6, 4331.34, 3135.21, 3839.85, 583.135, 892.938, 364.861, 13 147.3}}

  • We store the 2015 data, located in columns 9 to 14 and replace the words “Renew- ables” and “Hydro electric” with “Renewables” and “Hydroelectric” respectively.

    {energytype, mtoe2015}={energytype1[[9 ;; 14]], mtoe[[9 ;; 14]]}

    {{Oil, Natural Gas, Coal, Nuclear Energy, Hydro electric, Renew- ables},
     {4331.34, 3135.21, 3839.85, 583.135, 892.938, 364.861}}

    {energytype [[5]], energytype[[6]]} = {"Hydroelectric", "Renewables"};

  • We represent the previous data in a bar chart. To see the actual consumption by source, in Mtep, place the cursor in the corresponding bar.

    BarChart[mtoe2015, BarOrigin → Left, BarSpacing → None,
     ChartStyle → "DarkRainbow", ChartLabels → energytype,
     AxesLabel → {Style[ "Energy Type", Bold], Style[ "MTEP", Bold]}]

    Images

  • We display the same data in a pie chart including the percentage share of the total for each energy source. Notice that fossil fuels represent the vast majority of the world primary energy consumption with renewables (excluding hydroelectric) playing a very small role (wind, sun, biomass, etc.)

    PieChart[mtoe2015/Total[mtoe2015], ChartStyle -> "DarkRainbow",
     ChartLabels -> energytype, LabelingFunction →
      (Placed [Row[{NumberForm[100#, {3, 1}], "%"}], "RadialCallout"] &),
     PlotLabel → Style["Sources of Energy (%)", Bold]]

    Images

    To highlight an individual source of energy just click on its wedge.

2.5.2 Global Primary Energy Consumption Temporal Evolution

In this section we are going to analyze the evolution of the global primary energy consumption and look for a function that will enable us to extrapolate energy consumption over the coming years.

  • We import the worksheet “Primary Energy - Consumption ”:

    primaryenergyDataYear =
      Import["bp-statistical-review-of-world-energy-2016-workbook.xlsx",
       {"Sheets", "Primary Energy Consumption "}] ;

  • Below we show the imported data:

    TableView[primaryenergyDataYear]

    Images

  • Notice that the desired information is in row 3 (year) and row 90 (Total World, in Mtep). We also eliminate the first and the last two rows since they are not relevant.

    primaryevolution =
      Drop[Drop[Transpose[primaryenergyDataYear[[{3, 90}]]], 1], -2];

  • We plot the data. Using Tooltip, if you click on a curve point, its value will be shown:

    dataplot = ListLinePlot[Tooltip[ primaryevolution],
      PlotStyle → PointSize[.01], Mesh → All, AxesLabel → {"Year", "MTEP"},
      PlotLabel → Style["Energy Consumption Evolution", Bold]]

    Images

  • The shape of the data suggests that the growth has been approximately linear, although a polynomial fit could probably generate better results. Here we use Fit (Later, in Section 6.4, we will explain more about fitting). In this case, a polynomial with degree 2 is an appropriate choice.

    model = Fit[primaryevolution, {1, t, t^2}, t]

    4.20076×106 - 4390.21 t + 1.14738 t2

  • We show both the real data and the fitted curved extrapolated until the year 2020:

    Show[
     Plot[model, {t, 1965, 2020}, PlotStyle → {Pink, Thick}, PlotRange → All],
     dataplot, AxesLabel → {"Year", "MTEP"},
     PlotLabel → Style["Energy Consumption Extrapolation", Bold]]

    Images

  • We generate a table that includes: year (t), real consumption (r(t)) and fitted consumption (f(t)). This is done by applying the replacement rule {a, b} → {t, r(t), f(t)} to the data from primaryevolution. We replace t in the object model with a obtaining the fitted value f(t) for each year. We also apply Round to avoid the decimal separator in the last year.

    exportData = primaryevolution /. {a_, b_} -> {Round[a], b, model /. t -> a};

  • We include the data from 2000, that we extract with:

    TableForm[exportData [[36 ;; 51]]
     , TableHeadings → {None, {"Year", "Consumption\n (MTEP)", "Fit\n(MTEP)"}}]

    Images

  • We export, in Excel format, all the data included in exportData to a file named “primaryenergy.xlsx”.

    Export["primaryenergy.xlsx", exportData, "XLSX"];

  • Finally, we open the file we just created (you need to have Excel or other similar application that can open xlsx files).

    SystemOpen["primaryenergy.xlsx"]

2.5.3 China Population Forecast

We can use the free-form input to import data. In this example, we use it to forecast China’s population.

Images

Images

  • Time series data can be fitted using TimeSeriesModelFit.

    tsm = TimeSeriesModelFit[%]

    Images

  • Now, the evolution of China’s population over the next 10 years can be estimated using TimeSeriesForecast.

    DateListPlot[{tsm["TemporalData"], TimeSeriesForecast[tsm, {10}]},
     PlotLegends → {"Historical", "Forecast"},
     PlotLabel → Style["Chinese Population", Bold]]

    Images

2.6 Database Access with Open Database Connectivity (ODBC)

In this section we are going to learn how to access big databases such as the ones usually found in large organizations nowadays.

Access to big databases (ORACLE, SQLServer, etc) installed in servers is usually done through ODBC connections. Mathematica offers the possibility of making such connections using the DatabaseLink package (see help: DatabaseLink/tutorial/Overview).

2.6.1 A Basic Example

  • First we load the package for ODBC connections, containing the JDBC variant:

    Needs["DatabaseLink`"]

Then we establish the connection to the desired database.

  • In this case we are going to connect to the publisher.mdb database included in DatabaseResources /Examples (we write down the complete path to the database for demonstration purposes although it can also be done just by typing: OpenSQLConnection [“publisher”]).

    conn =
     OpenSQLConnection[ JDBC["hsqldb", FileNameJoin[ {$UserBaseDirectory,
        "DatabaseResources", "Examples", "publisher" }] ],
        "Name" -> "manualA", "Username" -> "sa"]

    Images

  • Here we show the name of the tables in the database.

    SQLTableNames[conn]

    {AUTHORS, EDITORS, PUBLISHERS, ROYSCHED,
     SALES, SALESDETAILS, TITLEAUTHORS, TITLEDITORS, TITLES}

  • We display the entries in the PUBLISHERS table with their corresponding headings.

    SQLSelect[conn, "publishers", "ShowColumnHeadings" → True] // TableForm

    PUB_ID   PUB_NAME                   ADDRESS          CITY         STATE
    0736     Second Galaxy Books        100 1st St.      Boston       MA
    0877     Boskone & Helmuth          201 2nd Ave.     Washington   DC
    1389     NanoSoft Book Publishers   302 3rd Dr.      Berkeley     CA

2.6.2 The Database Explorer

The Database Explorer (Figure 2.1) is a graphical interface used to interact with databases. Sometimes, it may be useful to make a query and later on copy it to a notebook using its capabilities as shown later on in the section. However, we believe it is better in general to use the SQL functions included in DatabaseLink as seen in the previous section.

  • To launch it you need to use the command DatabaseExplorer[] and use the Connection Tool to select the database:

    DatabaseExplorer[]

    Images

    Images

    Figure 2.1     The Database Explorer.

  • After clicking on the icon, the databases available in our system will be shown (Figure 2.2).

    Images

    Figure 2.2     The Connection Tool displaying all the databases available in the system.

  • Once we connect to a database, we will see its tables and columns. In this case we choose publisher and press Connect (Figure 2.3).

    Images

    Figure 2.3     Tables and columns of the publisher database.

  • With Show Advanced Options (the horizontal bar at the bottom in the previous screenshot) you can build advanced queries. However, in this example, to keep things simple, we just select AUTHORS and PHONE in the Columns section and click on Result (Figure 2.4):

    Images

    Figure 2.4     Query results.

  • The query and its result can be saved by clicking on the Create a notebook with this query icon (the right-most icon on the image above).

  • When pressed, a new Mathematica notebook is generated containing the SQL commands. We have copied the actual contents of the notebook in the next cell.

    SQLExecute[SQLSelect["publisher", {"AUTHORS"},
      {SQLColumn[{"AUTHORS", "PHONE"}]}, None, "SortingColumns" → None,
      "MaxRows" → 100, "Timeout" → 10, "Distinct" → False,
      "GetAsStrings" → False, "ShowColumnHeadings" → True]] // TableForm

    PHONE
    415 555–6543
    415 555–7543
    415 555–8675
    801 555–8653
    801 555–6539
    219 555–5434
    301 555–7864
    707 555–4504
    415 555–0764
    415 555–5489
    415 555–8741
    415 555–4378
    415 555–2463
    415 555–8513
    408 555–1246
    707 555–1861
    615 555–7862
    408 555–1556
    615 555–4561
    415 555–4781
    415 555–1568
    503 555–7865
    913 555–0156

  • Finally we need to close the connection.

    CloseSQLConnection [conn ];

2.6.3 Connecting to Database Servers

If you want to connect to a server, you will probably need to configure your ODBC connection tool to define the database location. This can be done in Windows XP/Vista/7/8 with: Control Panel ▶ Administrative Tools ▶ Data Sources ( ODBC) (Figure 2.5). If you have access to your organization’s database it should appear in the “User Data Sources:” window and you will be able to connect to it using this interface.

Images

Figure 2.5     The Windows ODBC Data Source Administrator Tool.

  • In this example we configure the access to a local MS Access database (Figure 2.6).

    Images

    Figure 2.6     Configuring access to a local MS Access database.

  • Next you will be able to connect to it using Mathematica following a procedure similar to this one:

    Needs["DatabaseLink`"]

    conn = OpenSQLConnection[JDBC["odbc", "Your DATABASE name"],
      "Username" → "Mylogin", "Password" → "Mypassword"]

  • To see the tables in the database type:

    SQLTableNames[conn]

    Import them or make the necessary queries.

  • It’s recommended to close the connection at the end.

    CloseSQLConnection [conn ];

2.7 Additional Resources

To access the following resources, if they refer to the help files, write their locations in a notebook (e.g., guide/ListManipulation), select them and press <F1>. In the case of external links, copy the web addresses in a browser:

List Manipulation: guide/ListManipulation

Importing and Exporting Data: tutorial/ImportingAndExportingData

Basic Statistics: tutorial/BasicStatistics

Descriptive Statistics: tutorial/DescriptiveStatistics

Continuous Distributions: tutorial/ContinuousDistributions

Discrete Distributions: tutorial/DiscreteDistributions

Convolutions and Correlations: tutorial/ConvolutionsAndCorrelations

How to Do Statistical Analysis: howto/DoStatisticalAnalysis

Curve Fitting: tutorial/CurveFitting

DatabaseLink Tutorial: DatabaseLink/tutorial/Overview