Tuesday, 27 November 2018

Top 20 Python Machine Learning Open Source Projects

1. TensorFlow

TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google's Machine Intelligence research organization. The system is designed to facilitate research in machine learning, and to make it quick and easy to transition from research prototype to production system. Contributors: 1324 (168% up), Commits: 28476, Stars: 92359.

2. Scikit-learn

Scikit-learn provides simple and efficient tools for data mining and data analysis that are accessible to everybody and reusable in various contexts. It is built on NumPy, SciPy, and matplotlib, and is open source and commercially usable under the BSD license. Contributors: 1019 (39% up), Commits: 22575

3. Keras

Keras is a high-level neural network API written in Python and capable of running on top of TensorFlow, CNTK, or Theano. Contributors: 629 (new), Commits: 4371

4. PyTorch

PyTorch provides tensors and dynamic neural networks in Python with strong GPU acceleration. Contributors: 399 (new), Commits: 6458

5. Theano

Theano allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. Contributors: 327 (24% up), Commits: 27931

6. Gensim

Gensim is a free Python library for scalable analysis of plain-text documents: it extracts their semantic structure and retrieves semantically similar documents. Contributors: 262 (81% up), Commits: 3549
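To make "retrieve semantically similar documents" concrete, here is a minimal sketch in plain Python. It is not Gensim's API: it assumes a simple bag-of-words representation and cosine similarity, and the function names and the toy corpus are made up for illustration. Libraries such as Gensim use richer models, but the retrieval step follows the same idea of ranking documents by vector similarity.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercase the text, split on whitespace, and count each token."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, documents):
    """Return (index, score) of the document closest to the query."""
    query_vec = bag_of_words(query)
    scores = [cosine_similarity(query_vec, bag_of_words(d)) for d in documents]
    best = max(range(len(documents)), key=lambda i: scores[i])
    return best, scores[best]


# Hypothetical toy corpus, invented for this example.
corpus = [
    "neural networks learn layered representations",
    "topic models find themes in text documents",
    "gradient descent minimizes a loss function",
]
index, score = most_similar("find themes in documents", corpus)
print(index, round(score, 3))
```

Only the second document shares any words with the query, so it is the one returned.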

7. Caffe

Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by the Berkeley Vision and Learning Center (BVLC) and community contributors. Contributors: 260 (21% up), Commits: 4099

8. Chainer

Chainer is a Python-based, standalone open source framework for deep learning models. Chainer provides a flexible, intuitive, and high-performance means of implementing a full range of deep learning models, including state-of-the-art models such as recurrent neural networks and variational auto-encoders. Contributors: 154 (84% up), Commits: 12613

9. Statsmodels

Statsmodels is a Python module that allows users to explore data, estimate statistical models, and perform statistical tests. An extensive list of descriptive statistics, statistical tests, plotting functions, and result statistics is available for different types of data and each estimator. Contributors: 144 (33% up), Commits: 9729

10. Shogun

Shogun is a machine learning toolbox that provides a wide range of unified and efficient machine learning (ML) methods. The toolbox makes it easy to combine multiple data representations, algorithm classes, and general-purpose tools. Contributors: 139 (32% up), Commits: 16362

11. Pylearn2

Pylearn2 is a machine learning library. Most of its functionality is built on top of Theano. This means you can write Pylearn2 plugins (new models, algorithms, etc.) using mathematical expressions, and Theano will optimize and stabilize those expressions for you and compile them to a backend of your choice (CPU or GPU). Contributors: 119 (3.5% up), Commits: 7119

12. NuPIC

NuPIC is an open source project based on a theory of neocortex called Hierarchical Temporal Memory (HTM). Parts of HTM theory have been implemented, tested, and used in applications, and other parts of HTM theory are still being developed. Contributors: 85 (12% up), Commits: 6588

13. Neon

Neon is Nervana's Python-based deep learning library. It provides ease of use while delivering the highest performance. Contributors: 78 (66% up), Commits: 1112

14. Nilearn

Nilearn is a Python module for fast and easy statistical learning on NeuroImaging data. It leverages the scikit-learn Python toolbox for multivariate statistics with applications such as predictive modelling, classification, decoding, or connectivity analysis. Contributors: 69 (50% up), Commits: 6198

15. Orange3

Orange3 is an open source machine learning and data visualization toolkit for novices and experts. It offers interactive data analysis workflows with a large toolbox. Contributors: 53 (33% up), Commits: 8915

16. PyMC

PyMC is a Python module that implements Bayesian statistical models and fitting algorithms, including Markov chain Monte Carlo. Its flexibility and extensibility make it applicable to a large suite of problems. Contributors: 39 (5.4% up), Commits: 2721
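For readers new to Markov chain Monte Carlo, the following is a minimal random-walk Metropolis sampler in plain Python. It is a sketch of the general technique, not this library's API. The model (Gaussian data with a known standard deviation of 1 and a flat prior on the mean), the function names, and the data values are all assumptions chosen for illustration.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Log posterior of mu up to a constant: flat prior, Gaussian likelihood."""
    return -sum((x - mu) ** 2 for x in data) / (2.0 * sigma ** 2)


def metropolis(data, n_samples=5000, step=0.5, seed=0):
    """Random-walk Metropolis: propose a nearby value, accept or reject it."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_samples):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, exp(candidate - current)).
        if candidate >= current or rng.random() < math.exp(candidate - current):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


# Hypothetical measurements, invented for this example (sample mean 5.05).
data = [4.8, 5.1, 5.3, 4.9, 5.2, 5.0, 4.7, 5.4]
samples = metropolis(data)
kept = samples[1000:]  # discard the first 1000 draws as burn-in
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 2))
```

With a flat prior the posterior mean should land close to the sample mean of the data. A dedicated library adds model specification, better samplers, and convergence diagnostics on top of this basic loop.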

17. DEAP

DEAP is a novel evolutionary computation framework for rapid prototyping and testing of ideas. It seeks to make algorithms explicit and data structures transparent. It works in perfect harmony with parallelisation mechanisms such as multiprocessing and SCOOP. Contributors: 39 (86% up), Commits: 1960
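As a rough picture of what an evolutionary computation loop looks like, here is a small genetic algorithm in plain Python. It does not use this framework's API. The problem (OneMax, maximizing the number of 1 bits), the operators (tournament selection, one-point crossover, bit-flip mutation, one-individual elitism), and every name and parameter value are assumptions chosen for illustration.

```python
import random


def fitness(individual):
    """OneMax: the fitness is the number of 1 bits."""
    return sum(individual)


def tournament(population, rng, size=3):
    """Pick the fittest of `size` randomly chosen individuals."""
    return max(rng.sample(population, size), key=fitness)


def evolve(n_bits=20, pop_size=30, generations=40, mutation_rate=0.05, seed=1):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Elitism: carry the current best individual over unchanged.
        next_population = [max(population, key=fitness)]
        while len(next_population) < pop_size:
            mother = tournament(population, rng)
            father = tournament(population, rng)
            # One-point crossover followed by independent bit-flip mutation.
            cut = rng.randint(1, n_bits - 1)
            child = mother[:cut] + father[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]
            next_population.append(child)
        population = next_population
    return max(population, key=fitness)


best = evolve()
print(fitness(best), best)
```

Because the best individual is always kept, the best fitness never decreases from one generation to the next.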

18. Annoy

Annoy (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point. It also creates large read-only file-based data structures that are mapped into memory so that many processes may share the same data. Contributors: 35 (46% up), Commits: 527
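For context, the exact version of this search can be written as a brute-force scan in plain Python. This sketch is not Annoy's API and is not approximate: it compares the query with every point, which is the linear-time baseline that approximate indexes are built to avoid. The function name and the sample points are made up for illustration.

```python
import math


def nearest_neighbors(points, query, k=2):
    """Exact k-nearest-neighbor search by scanning every point."""
    by_distance = sorted(range(len(points)),
                         key=lambda i: math.dist(points[i], query))
    return by_distance[:k]


# Hypothetical 2-D points, invented for this example.
points = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
neighbors = nearest_neighbors(points, (1.0, 1.0), k=2)
print(neighbors)  # [1, 3]
```

The scan costs one distance computation per stored point for every query, which is why large collections call for an index that trades a little accuracy for speed.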

19. PyBrain

PyBrain is a modular machine learning library for Python. Its goal is to offer flexible, easy-to-use yet still powerful algorithms for machine learning tasks and a variety of predefined environments to test and compare your algorithms. Contributors: 32 (3% up), Commits: 992

20. Fuel

Fuel is a data pipeline framework which provides your machine learning models with the data they need. It is planned to be used by both the Blocks and Pylearn2 neural network libraries. Contributors: 32 (10% up), Commits: 1116
