Your Career Platform for Big Data

Be part of the digital revolution in Switzerland

 

Latest Jobs

GSI Consultants Zürich, Switzerland
30/06/2018
Full time
Category: Software Development / Software Architecture

Company description
Pro Informatik is an IT consulting company providing high-quality services to our customers in various businesses. Our aim is to deliver first-class consulting services in the field of information technology. We have strong expertise in project management, business analysis, requirements engineering, software engineering, system engineering and testing, as well as application and system administration.

Responsibilities
Develop state-of-the-art low-latency, high-throughput solutions based on the StreamBase technology
Build and maintain automated tests
Ensure high QA standards
Tune and optimize the performance of systems and software
Organize deliveries with the required business and IT stakeholders
Assimilate and understand the Markets IT landscape to identify key systems
Provide production support for business-critical applications with fast response times

Requirements
Experience in/with:
Streaming application development
Designing and implementing large-scale enterprise applications
Any CEP framework (StreamBase, Business Events, Software AG Apama, IBM Streams, Oracle CEP, SAP Event Stream etc.)
Relational databases such as Oracle and MS SQL Server
Database design and SQL enhancement
Very good language skills in English

Nice to have
Experience with BPM systems (iProcess, Bonita, Shark etc.)
Experience with enterprise messaging (IBM MQ, RabbitMQ, TIBCO Rendezvous, TIBCO FTL)
Experience with Elasticsearch and the ELK Stack

Personality
Strong communication skills
Very good team player
Self-driven and ambitious

What we offer
We are a strongly customer-focused company working mainly for Top 500 companies. We pay great attention to further training and development, career opportunities and working conditions. We provide these opportunities to Swiss and EU residents.
If you are strong in the above-mentioned fields, do not hesitate to send us your application and let us tell you more about it. Galman Alexander +41 (0)44 252 50 51 alexander@pro-info.ch
indigita Zürich, Switzerland
30/06/2018
Full time
indigita is a RegTech company created to solve the problems facing banks and financial advisers with international business when it comes to cross-border compliance. indigita has developed solutions to meet new regulatory challenges and to be a pioneer in digitalized regulatory data production and distribution. To strengthen our team in Zurich, we are looking for a:

Java Developer

After an introduction to our application portfolio and our development environment, you will join the development project focusing on IT and regulatory aspects. This is an outstanding opportunity to participate in the development of a new and innovative business.

Your responsibilities
Design, architecture and development of a unique RegTech solution for the banking sector
End-to-end development of the application, including front-end, web API and back-end development
Work with the team leader to evaluate changes and enhance the solution
Enhancing overall concepts and evaluating the business impact of any changes

Your qualifications
University degree in IT engineering
At least 4 years of experience in Java, Spring and OOP
Good understanding of architecture principles, development frameworks and design patterns
Hands-on experience implementing REST services
Practical knowledge of relational (MariaDB) and NoSQL databases (MongoDB, Elasticsearch)
Testing and quality driven
Experience with UI technologies and Angular
Nice to have: knowledge of Groovy
Excellent interpersonal and communication skills with a good command of written and spoken English; French and German would be an advantage
Swiss or EU passport

We offer you
A unique opportunity to be part of a fast-growing and dynamic RegTech company. Diverse and challenging tasks, possibilities to develop multi-disciplinary IT / business skills, a training path and a competitive compensation package. If you are interested in a new challenge within a young and very motivated team, please send your complete application.

DataCareer Blog

Are you looking for real-world data science problems to sharpen your skills? In this post, we introduce you to four platforms hosting data science competitions. Data science competitions can be a great way to gain practical experience with real-world data and to boost your motivation through the competitive environment they provide. Check them out; competitions are a lot of fun!

Kaggle
Kaggle is the best-known platform for data science competitions. Data scientists and statisticians compete to create the best models for describing and predicting the data sets uploaded by companies or NGOs. From predicting house prices in the US to demographics of mobile phone users in China or the properties of soil in Africa, Kaggle offers many interesting challenges to solve real-world problems. Check out their No Free Hunch Blog featuring the winners of each competition. The platform was recently acquired by Alphabet, Google’s parent company, and also offers a wide range of datasets to train your algorithms and other useful resources to improve your data science skill set.

DrivenData
As on other platforms, the dataset is available online and participants submit their best predictive models. The great thing about DrivenData competitions is that the competition questions and datasets are related to the work of non-profits, which can be especially interesting to those who want to contribute to a good cause. Furthermore, the data problems are no less diverse and range from predicting dengue fever cases to estimating the penguin population in the Antarctic and forecasting energy consumption levels. For some challenges, the best model wins a prize; for others, you get the glory and the knowledge that you applied your skill set to make the world a better place. DrivenData offers great opportunities to tackle real-world problems with real-world impact.

Numerai
Numerai is a data science competition platform focusing on finance applications. What makes their competitions particularly interesting is that the participants’ predictions are used in the underlying hedge fund. Data scientists entering Numerai’s tournaments currently receive an encrypted data set every week. The data set is an abstract representation of stock market information that preserves its structure without revealing details. The data scientists then create machine-learning algorithms to find patterns in the data, and they test their models by uploading their predictions to the website. Numerai then creates a meta-model from all submissions to make its investments. The models are ranked, with the top 100 earning Numeraire coins, a cryptocurrency launched by Numerai. Numerai's mix of data science, cryptography, artificial intelligence, crowdsourcing and bitcoin has given the fledgling business an exciting flair.

Tianchi
Tianchi is a data competition platform by Alibaba Cloud, the cloud computing arm of Alibaba Group, and has strong similarities with Kaggle. The platform focuses on Chinese data scientists, but most pages are also available in English. Tianchi boasts a community of over 150,000 data scientists and 3,000 institutes and business groups from over 80 countries. Besides the competitions, the platform also offers datasets and a notebook to run Python 3 scripts.
Companies use machine learning to improve their business decisions. Algorithms select ads, predict consumers’ interest or optimize the use of storage. However, few stories of machine learning applications for public policy are out there, even though public employees often make comparable decisions. As in the business examples, decisions by public employees often try to optimize the use of limited resources. Algorithms may assist tax authorities in improving the allocation of available working hours, or help bankers make lending decisions. Similarly, algorithms can be employed to guide decisions taken by social workers or judges.

This blog post lists three research papers that analyze and discuss the use of machine learning for very specific problems in public policy. While the potential seems huge, we do not want to neglect some of the many potential pitfalls for machine learning in public policy. Business applications often maximize profits. For policy decisions, however, the outcome to be maximized may be harder to define, or multidimensional. In many cases, not all relevant outcome dimensions are directly observable and measurable, which makes it more difficult to evaluate the impact of an algorithm. Tech companies usually obtain training datasets through experimentation, while datasets for public policy often contain only one outcome for a specific group of people. If tax authorities never scrutinize restaurants, how can we form a predictive model for this industry? Predictions for public policy problems often face this so-called selected labels problem, and getting around it takes innovative approaches and the willingness to perform randomized experiments. This is just a brief list. Susan Athey’s paper provides more food for thought on the potential - and potential pitfalls - of using prediction in public policy.
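The selected labels problem can be made concrete with a toy example. The sketch below uses made-up records (the industries, audit flags and outcomes are all hypothetical and not taken from any of the papers listed here) in which restaurants are never audited. Because an outcome is only observed for audited firms, the observed data says nothing about restaurants and overstates the evasion rate in construction.

```python
from collections import defaultdict

# Hypothetical toy records: (industry, was_audited, evaded_tax).
# In real data, evaded_tax is only known for audited firms; it is listed
# for every firm here so the hidden truth can be compared with what is observed.
FIRMS = [
    ("construction", True, True),
    ("construction", True, False),
    ("construction", False, False),
    ("construction", False, False),
    ("restaurant", False, True),
    ("restaurant", False, True),
    ("restaurant", False, False),
    ("restaurant", False, False),
]


def evasion_rates(firms, audited_only):
    """Share of evaders per industry, optionally restricted to audited firms."""
    counts = defaultdict(lambda: [0, 0])  # industry -> [evaders, firms counted]
    for industry, audited, evaded in firms:
        if audited_only and not audited:
            continue  # the label is unobserved, so the firm drops out
        counts[industry][0] += int(evaded)
        counts[industry][1] += 1
    return {industry: evaders / total for industry, (evaders, total) in counts.items()}


observed = evasion_rates(FIRMS, audited_only=True)   # what the tax authority sees
truth = evasion_rates(FIRMS, audited_only=False)     # what it would see with full labels

print("observed:", observed)
print("truth:   ", truth)
```

In this toy data the audited sample contains no restaurants at all, so any model fitted to it has nothing to learn from for that industry. Auditing a random subset of firms is one way to obtain labels that are not shaped by past selection.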
Research on Machine Learning Applications in Public Policy

Improving refugee integration through data-driven algorithmic assignment
Developed democracies are settling an increased number of refugees, many of whom face challenges integrating into host societies. We developed a flexible data-driven algorithm that assigns refugees across resettlement locations to improve integration outcomes. The algorithm uses a combination of supervised machine learning and optimal matching to discover and leverage synergies between refugee characteristics and resettlement sites. The algorithm was tested on historical registry data from two countries with different assignment regimes and refugee populations, the United States and Switzerland. Our approach led to gains of roughly 40 to 70%, on average, in refugees’ employment outcomes relative to current assignment practices. This approach can provide governments with a practical and cost-efficient policy tool that can be immediately implemented within existing institutional structures.
Bansak, K., Ferwerda, J., Hainmueller, J., Dillon, A., Hangartner, D., Lawrence, D., & Weinstein, J.; Science, 2018
Switzerland is currently implementing an algorithm-based allocation of refugees. We are excited to see the first results!

Human Decisions and Machine Predictions
Can machine learning improve human decision making? Bail decisions provide a good test case. Millions of times each year, judges make jail-or-release decisions that hinge on a prediction of what a defendant would do if released. The concreteness of the prediction task combined with the volume of data available makes this a promising machine-learning application. Yet comparing the algorithm to judges proves complicated. First, the available data are generated by prior judge decisions. We only observe crime outcomes for released defendants, not for those judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the variable the algorithm predicts; for instance, judges may care specifically about violent crimes or about racial inequities. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: one policy simulation shows crime reductions up to 24.7% with no change in jailing rates, or jailing rate reductions up to 41.9% with no increase in crime rates. Moreover, all categories of crime, including violent crimes, show reductions; these gains can be achieved while simultaneously reducing racial disparities. These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals.
Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S.; Quarterly Journal of Economics, 2018

Using Text Analysis to Target Government Inspections: Evidence from Restaurant Hygiene Inspections and Online Reviews
Restaurant hygiene inspections are often cited as a success story of public disclosure. Hygiene grades influence customer decisions and serve as an accountability system for restaurants. However, cities (which are responsible for inspections) have limited resources to dispatch inspectors, which in turn limits the number of inspections that can be performed. We argue that NLP can be used to improve the effectiveness of inspections by allowing cities to target restaurants that are most likely to have a hygiene violation. In this work, we report the first empirical study demonstrating the utility of review analysis for predicting restaurant inspection results.
Kang, J. S., Kuznetsova, P., Choi, Y., & Luca, M.; Technical Report, 2013
Here is a related paper on the same topic, suggesting ways for governments to obtain the required expertise: Crowdsourcing City Government: Using Tournaments to Improve Inspection Accuracy

Further readings
Two papers with an excellent overview of the topic:
Machine Learning: An Applied Econometric Approach
Prediction Policy Problems
The Economist on the same topic: Of prediction and policy, The Economist 2016
Curious about neural networks and deep learning? This post will inspire you to get started in deep learning. Why are we witnessing this kind of buildup around neural networks? Because of their amazing applications, which include image classification, face recognition, pattern recognition and automatic machine translation. So, let’s get started.

Machine learning is a field of computer science that gives computers the capability to learn and improve from experience without being explicitly programmed. Deep learning is a form of machine learning that uses a computing model inspired by the structure of the brain; hence, we call this computing model a neural network. A neural network is a computing system comprising simple, highly interconnected processing elements that process information through their dynamic state response to external inputs. A ‘neuron’ is the fundamental processing element of a neural network. The network comprises a large number of neurons working simultaneously to solve specific problems.

This article explains the concept of neural networks and why they are a vital component of deep learning. It also covers:
The advantages of neural networks over conventional techniques
The working of a neural network, including training and learning rules
Network models and algorithms of neural networks

Why Neural Networks Matter in Deep Learning
If machine learning is a pack horse for processing information, the neural network is the carrot that draws the horse forward. To learn truly, a system should not be programmed to execute a specific task; instead, it must be programmed to learn to execute the task. To accomplish this, the system uses deep learning (a more refined form of machine learning), which is based on neural networks.
With the help of neural networks, the system can perceive data patterns independently to learn how to execute a task.

Advantages of Neural Networks over Conventional Techniques
Depending on the strength of internal data patterns and the nature of the application, you can usually expect a network to train well. This applies to problems where the relationships may be quite nonlinear or dynamic. Conventional techniques are very often limited by strict assumptions of variable independence, linearity, normality, etc. Because a neural network can capture many types of relationships, it lets the user model, relatively easily and quickly, phenomena that would otherwise have been very difficult or impossible to explain.

Working of a Neural Network
Neural networks are modeled after the neuronal structure of the brain’s cerebral cortex, but on a smaller scale. They are usually organized in layers. Layers are made up of many interconnected nodes, each of which contains an activation function. Patterns are presented to the network through the input layer. This layer communicates with one or more hidden layers, where the real processing is carried out through a system of weighted connections. The hidden layers are then connected to an output layer, which delivers the answer, as depicted in the figure below.

Information flows through a neural network in two situations: while the network is learning (during training) and while it is operating normally (after training). In both, patterns of information are fed into the network through the input units. These input units trigger the layers of hidden units, and these in turn arrive at the output units. This design is called a feedforward network. Every unit receives inputs from the units to its left, and the inputs are multiplied by the weights of the connections they travel along.
Each unit sums all the inputs it receives in this way and, if the sum exceeds a certain threshold value, triggers the units to its right. The next section shows how a neural network learns.

Working of a Neural Network - Training
Training a neural network involves applying a set of steps to adjust the thresholds and weights of its neurons. This adjustment process (also known as the learning algorithm) tunes the network so that its outputs are very close to the desired values. The network is ready to be trained once it is structured for a specific application. To begin the process, the initial weights are selected randomly. Then the training, or learning, starts.

There are two approaches to training: supervised and unsupervised. In supervised training, the network is provided with the desired output in one of two ways: by manually grading the performance of the network, or by supplying the desired outputs together with the inputs. In unsupervised training, the network must make sense of the inputs without outside help. To put this in familiar terms: your kids are supervised if you provide them with a solution in every situation in their life; they are unsupervised if they make decisions on their own, based on their own understanding.

Most neural networks include some form of learning rule that alters the weights of the connections according to the input patterns presented to the network. Like their biological counterparts, neural networks learn by example.

Working of a Neural Network - Learning Rules
Neural networks use various kinds of learning rules:
Hebbian Learning Rule - This learning rule determines how to alter the weights of the nodes of a network.
Perceptron Learning Rule - The network begins its learning by assigning a random value to each weight.
Delta Learning Rule - The change in a node’s synaptic weight is equal to the product of the input and the error.
Correlation Learning Rule - This is a supervised learning rule.
Outstar Learning Rule - This rule can be used when the neurons or nodes in a network are assumed to be arranged in a layer.
The Delta Learning Rule is often used by the most common class of neural networks, known as BPNNs (backpropagation neural networks). Backpropagation means the backward propagation of error.

Major Neural Network Models
The primary neural network models are as follows:
Multilayer perceptron - This neural network model maps the input data sets onto a set of appropriate outputs.
Radial Basis Function Network - This neural network uses radial basis functions as activation functions.
Both of the above models are supervised learning networks, and they are used with one or more dependent variables at the output.
Kohonen Network - This is an unsupervised learning network, used for clustering.

Neural Network Algorithms
As stated earlier, the procedure used to carry out the learning process in a neural network is known as the training algorithm. There are various training algorithms with different performance and characteristics. The major ones are Gradient Descent (used to find a function’s local minimum) and Evolutionary Algorithms (based on the concept of survival of the fittest, or natural selection, in biology).

Deep Neural Networks
Deep neural networks can be thought of as components of broader machine learning applications that involve algorithms for regression, classification and reinforcement learning (goal-oriented learning that depends on interaction with the environment). These networks are distinguished from single-hidden-layer neural networks by their depth, that is, the number of node layers through which the data passes in a multi-step process of pattern recognition.
Conventional machine learning depends on shallow networks composed of one input layer and one output layer, with at most one hidden layer in between. Counting the input and output layers, more than three layers qualify as ‘deep’ learning. The figure below shows a deep neural network with three hidden layers in addition to the input and output layers. Hence, deep is a technical and strictly defined term that implies more than one hidden layer. In deep neural networks, each layer of nodes trains on a different feature set, based on the previous layer’s output.

Unlike most traditional machine learning algorithms, deep neural networks carry out automatic feature extraction without human intervention. These networks can discover latent structures within unstructured (raw), unlabeled data, which makes up the majority of data in the world. A deep neural network trained on labeled data can be applied to raw data. This gives the deep neural network access to much more input than conventional machine learning networks have. This matters for performance, because the accuracy of a network depends on how much data it is trained on: training on more data generally results in higher accuracy.

Applications of Neural Networks in Python and R

Python Libraries using Neural Networks

Theano
Theano is an open source project released under the BSD license. At its heart, Theano is a compiler for mathematical expressions in Python. It knows how to take your structures and turn them into very efficient code that uses NumPy, efficient native libraries like BLAS and native code (C++) to run as fast as possible on CPUs or GPUs. It uses a host of clever code optimizations to squeeze as much performance as possible from your hardware. The actual syntax of Theano expressions is symbolic, which can be off-putting to beginners used to normal software development. Specifically, expressions are defined in the abstract sense, compiled, and only later actually used to make calculations.
It was specifically designed to handle the types of computation required for large neural network algorithms used in Deep Learning. It was one of the first libraries of its kind and is considered an industry standard for Deep Learning research and development.

TensorFlow
TensorFlow is an open source library for fast numerical computing. It was created and is maintained by Google and released under the Apache 2.0 open source license. The API is nominally for the Python programming language, although there is access to the underlying C++ API. Unlike other numerical libraries intended for use in Deep Learning, such as Theano, TensorFlow was designed for use both in research and development and in production systems, not least RankBrain in Google search and the fun Deep Dream project. It can run on single CPU systems and GPUs, as well as on mobile devices and large-scale distributed systems of hundreds of machines. It’s easy to classify TensorFlow as a neural network library, but it’s not just that. Yes, it was designed to be a powerful neural network library, but it has the power to do much more than that. You can build other machine learning algorithms on it, such as decision trees or k-Nearest Neighbors. You can literally do everything you normally would do in NumPy! It’s aptly called “numpy on steroids.”

R Libraries using Neural Networks

Caret
The caret package is a set of tools for building machine learning models in R. The name “caret” stands for Classification And REgression Training. As the name implies, the caret package gives you a toolkit for building classification models and regression models. Moreover, caret provides you with essential tools for data splitting, pre-processing, feature selection, model tuning using resampling and variable importance estimation, as well as other functionality. There are many different modeling functions in R. Some have different syntax for model training and/or prediction.
The package started off as a way to provide a uniform interface to the functions themselves, as well as a way to standardize common tasks (such as parameter tuning and variable importance). Caret provides a simple, common interface to almost every machine learning algorithm in R. When using caret, different learning methods such as linear regression, neural networks and support vector machines all share a common syntax (the syntax is basically identical, except for a few minor changes). Moreover, additional parts of the machine learning workflow, like cross-validation and parameter tuning, are built directly into this common interface. To put it more simply, caret provides you with an easy-to-use toolkit for building many different model types and executing critical parts of the ML workflow. This simple interface enables rapid, iterative modeling. In turn, this iterative workflow will allow you to develop good models faster, with less effort and with less frustration.

nnet
There are many ways to create a neural network. You can code your own from scratch using a programming language such as C# or R. You can also use a tool such as the open source Weka or Microsoft Azure Machine Learning. The R language has an add-on package named nnet that allows you to create a neural network classifier. The nnet R package was created by Brian Ripley. You can evaluate the accuracy of the model and make predictions using the nnet package. The functions in the nnet package allow you to develop and validate the most common type of neural network model, i.e., the feed-forward multi-layer perceptron. The functions have enough flexibility to allow the user to develop the best models by varying parameters during the training process.

Conclusion
Neural networks have broad applicability to business problems in the real world. They are currently applied in various industries, and their applicability is increasing day by day.
The primary neural network applications include stock exchange prediction, image compression, handwriting recognition, fingerprint recognition, feature extraction, and so on. A great deal of research on neural networks is still under way.

Author: Savaram Ravindra is a writer on Mindmajix.com working on data science related topics. Previously, he was a Programmer Analyst at Cognizant Technology Solutions. He holds an MS degree in Nanotechnology from VIT University.
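To make the mechanics described in the article concrete, here is a minimal sketch in plain Python of a single threshold unit trained with an "error times input" weight update, the form the article gives for the Delta Learning Rule. Several choices are assumptions made for illustration and do not come from the article: the task (the logical AND function), the fixed threshold of 0 with a trainable bias, the learning rate of 0.5, and weights that start at zero rather than at random values so that the run is reproducible. Applied to a threshold unit like this one, the update coincides with the classic perceptron rule.

```python
def unit_output(inputs, weights, bias):
    """Sum the weighted inputs; the unit fires (1) only if the sum exceeds 0."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


def train(samples, learning_rate=0.5, epochs=20):
    """Adjust weights and bias by learning_rate * error * input for each sample."""
    weights = [0.0, 0.0]  # assumption: zeros instead of random values
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - unit_output(inputs, weights, bias)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error  # the bias acts as a weight on a constant input of 1
    return weights, bias


# Supervised training data: each input pattern comes with its desired output.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_SAMPLES)
predictions = [unit_output(inputs, weights, bias) for inputs, _ in AND_SAMPLES]
print(predictions)
```

After training, the unit reproduces the desired outputs for all four input patterns. A multilayer network repeats the same forward step layer by layer and propagates the error backwards to update the hidden layers, which this single-unit sketch does not attempt.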