So far, we have seen how to install Python, OpenCV, and a few other packages (NumPy and Matplotlib), either from scratch or using the Anaconda distribution, which includes many popular data-science packages. Some familiarity with the main packages for scientific computing, data science, machine learning, and computer vision is valuable because they offer powerful computational tools. Many Python packages will be used throughout this book. Not all of the packages listed in this section will be used, but a comprehensive list is provided for completeness and to show the potential of Python in topics related to the content of this book:
- NumPy (http://www.numpy.org/) provides support for large, multi-dimensional arrays. NumPy is a key library in computer vision because images can be represented as multi-dimensional arrays. With this representation, NumPy's indexing, slicing, and vectorized arithmetic can be applied directly to pixel data.
- OpenCV (https://opencv.org/) is an open source computer vision library.
- Scikit-image (https://scikit-image.org/) is a collection of algorithms for image processing. Images manipulated by scikit-image are simply NumPy arrays.
- The Python Imaging Library (PIL) (http://www.pythonware.com/products/pil/) is a library that provides powerful image-processing and graphics capabilities.
- Pillow (https://pillow.readthedocs.io/) is the friendly PIL fork by Alex Clark and contributors. Like PIL, it adds image-processing capabilities to your Python interpreter.
- SimpleCV (http://simplecv.org/) is a framework for computer vision that provides key functionalities to deal with image processing.
- Mahotas (https://mahotas.readthedocs.io/) is a set of functions for image processing and computer vision in Python. It was originally designed for bioimage informatics, but it is useful in other areas as well. It uses NumPy arrays as its only data type.
- Ilastik (http://ilastik.org/) is a user-friendly and simple tool for interactive image segmentation, classification, and analysis.
- Scikit-learn (http://scikit-learn.org/) is a machine learning library that features various classification, regression, and clustering algorithms.
- SciPy (https://www.scipy.org/) is a library for scientific and technical computing.
- NLTK (https://www.nltk.org/) is a suite of libraries and programs to work with human-language data.
- spaCy (https://spacy.io/) is an open-source software library for advanced natural language processing in Python.
- LibROSA (https://librosa.github.io/librosa/) is a library for both music and audio processing.
- Pandas (https://pandas.pydata.org/) is a library (built on top of NumPy) that provides high-level data computation tools and easy-to-use data structures.
- Matplotlib (https://matplotlib.org/) is a plotting library that produces publication-quality figures in a variety of formats.
- Seaborn (https://seaborn.pydata.org/) is a graphics library that is built on top of Matplotlib.
- Orange (https://orange.biolab.si/) is an open source machine learning and data-visualization toolkit for novices and experts.
- PyBrain (http://pybrain.org/) is a machine learning library that provides easy-to-use state-of-the-art algorithms for machine learning.
- Milk (http://luispedro.org/software/milk/) is a machine learning toolkit focused on supervised classification with several classifiers.
- TensorFlow (https://www.tensorflow.org/) is an open source machine learning and deep learning library.
- PyTorch (https://pytorch.org/) is an open source machine learning and deep learning library.
- Theano (http://deeplearning.net/software/theano/) is a library for defining, optimizing, and efficiently evaluating mathematical expressions. It can compile these expressions to run on both CPU and GPU architectures (a key point for deep learning).
- Keras (https://keras.io/) is a high-level deep learning library that can run on top of TensorFlow, Microsoft Cognitive Toolkit (CNTK), or Theano.
- Django (https://www.djangoproject.com/) is a Python-based free and open source web framework that encourages rapid development and clean, pragmatic design.
- Flask (http://flask.pocoo.org/) is a micro web framework written in Python and based on Werkzeug and Jinja2.
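To make the point about NumPy concrete, the following is a minimal sketch of how an image can be treated as a multi-dimensional array. It builds a small synthetic image instead of loading one from disk, so the array size, the colors, and the variable names (`image`, `roi`, `brighter`) are illustrative assumptions. The channel order is assumed to be BGR, which is the order OpenCV uses by default.

```python
import numpy as np

# Synthetic 4x6 color image: (height, width, channels), 8 bits per channel.
# The size and colors are arbitrary choices for illustration.
image = np.zeros((4, 6, 3), dtype=np.uint8)

# Paint the left half blue, assuming BGR channel order (OpenCV's default).
image[:, :3] = (255, 0, 0)

print(image.shape)  # (rows, columns, channels)
print(image.dtype)  # element type of every pixel value

# Indexing a single pixel returns its three channel values.
pixel = image[0, 0]

# Slicing selects a rectangular region of interest (rows 1-2, columns 2-4).
roi = image[1:3, 2:5]

# Vectorized arithmetic: brighten every pixel by 50. The computation is done
# in a wider type and clipped so that values do not wrap around past 255.
brighter = np.clip(image.astype(np.int16) + 50, 0, 255).astype(np.uint8)
```

Because scikit-image and Mahotas also operate on NumPy arrays, the same indexing and slicing operations apply to images handled by those libraries.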
All these packages can be organized based on their main purpose:
- To work with images: NumPy, OpenCV, scikit-image, PIL, Pillow, SimpleCV, Mahotas, ilastik
- To work with text: NLTK, spaCy, NumPy, scikit-learn, PyTorch
- To work with audio: LibROSA
- To solve machine learning problems: pandas, scikit-learn, Orange, PyBrain, Milk
- To visualize data: Matplotlib, Seaborn, scikit-learn, Orange
- To use deep learning: TensorFlow, PyTorch, Theano, Keras
- To do scientific computing: SciPy
- To integrate web applications: Django, Flask
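As a rough sketch, a grouping like the one above can also be used to check which packages are available in the current environment. The dictionary `PACKAGES_BY_PURPOSE` and the helpers `is_installed` and `availability_report` are hypothetical names introduced here for illustration, and only a subset of the packages is included. Note that the import name sometimes differs from the project name: OpenCV is imported as `cv2`, scikit-image as `skimage`, Pillow as `PIL`, scikit-learn as `sklearn`, and PyTorch as `torch`.

```python
from importlib.util import find_spec

# Purpose -> import names (hypothetical grouping covering part of the list).
PACKAGES_BY_PURPOSE = {
    "images": ["numpy", "cv2", "skimage", "PIL", "mahotas"],
    "text": ["nltk", "spacy"],
    "audio": ["librosa"],
    "machine learning": ["pandas", "sklearn"],
    "visualization": ["matplotlib", "seaborn"],
    "deep learning": ["tensorflow", "torch", "keras"],
    "scientific computing": ["scipy"],
    "web": ["django", "flask"],
}


def is_installed(import_name):
    """Return True if the top-level module can be located, without importing it."""
    return find_spec(import_name) is not None


def availability_report(groups=PACKAGES_BY_PURPOSE):
    """Map each purpose to a {import_name: bool} dictionary."""
    return {
        purpose: {name: is_installed(name) for name in names}
        for purpose, names in groups.items()
    }


# Print one line per purpose, listing which packages were found.
for purpose, status in availability_report().items():
    found = [name for name, ok in status.items() if ok]
    missing = [name for name, ok in status.items() if not ok]
    print(f"{purpose}: found={found} missing={missing}")
```

The output depends on the environment. With a full Anaconda installation, most of the scientific packages are expected to be reported as found, while the others would need to be installed separately.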
Additional Python libraries and packages for AI and machine learning can be found at https://python.libhunt.com/packages/artificial-intelligence.