Here the barriers will be analyzed using experimental design, statistical inference, queueing analysis, and simulation modeling. Fadiran Oluwafemi, An Investigation on Factors affecting Internet Banking in Nigeria, April 2016, (Yan Yu, Edward Winkofsky) Our methodology includes a consulting approach to first understand the problem from client stakeholders, and then apply data cleaning, wrangling, exploration, and visualizations to uncover trends and insights. The project will aim at creating a translator that utilizes and compares various Machine Learning algorithms based on model accuracy to predict the letters of the alphabet corresponding to static hand gestures of American Sign Language. The data ranges from 2011-01-29 to 2016-06-19. The linear model provided better results than the regression tree in predicting the interest rate for the applicants. The comparison between the low-sale-price range and the other ranges in terms of the property tax underpaid/overpaid clearly indicates that the homeowners of the low-sale-price houses are heavily taxed by the local government in an inappropriate manner. The Logistic Regression model was chosen as the best model for the Kickstarter project classification with an accuracy of 0.9996 and AUC of 0.9999. However, the data is highly unbalanced among classes, so we examine whether balancing the classes with the Synthetic Minority Oversampling Technique (SMOTE) has a notable effect on the accuracy and Area Under the Curve of the classifiers. To provide the best viewing experience and retain users, it is important for the OTT platforms to seamlessly suggest movies to both existing and new users.
MadTree is seeking to increase its market penetration in the areas where it currently sells its craft beer, so I have developed linear regression models to illustrate the relative significance of different predictor variables. It is never possible to completely gauge how a new player will adjust to NBA play. LASSO selected two variables. The deeper networks I built with a much higher number of neurons or layers fared worse than the smaller network. As of 2017, Spotify still isn’t profitable. We compare the default likelihood indicator (DLI) from the Merton model with the estimated default probability from the logistic model using rank correlation and decile rankings based on out-of-sample prediction. He had an illustrious career to say the least, holds numerous records and is regarded as one of the most celebrated players to ever grace the game. Nowadays all countries are concerned about providing sufficient energy to consumers as well as optimizing the total demand of energy consumed. However, as this product is the first of its kind in the market, getting actual customer data isn't possible. Jeevisha Anandani, Recommender System, August 2020 (Peng Wang, Michael Thompson). In this project, the exploratory data analysis includes feature selection based on distributions, correlation and data visualizations. The results indicate that not grouping the special events as a single entity yields a more reliable model, and thus those predicted results were given to the Zoo for additional analysis. Prarthana Rajendra, Cincinnati Children’s Hospital and Medical Center, July 2017, (Jason Tillman, Michael Fry) TBATS performed the best, providing a Train Absolute Mean Error of 0.124721 and a Test Absolute Mean Error of 7.236229. While it’s true that uncertainty in sports is the best thing, an increasing number of people are becoming proponents of analytics applied to basketball. The major market for each store was identified, and trade areas were divided.
Improving the accuracy of forecasts, hence, is one of the integral factors in the business planning of organizations. Sourapratim Datta, Product Recommendation: A Hybrid Recommendation Model, July 2018, (Michael Fry, Shreshth Sharma) Instacart, a grocery ordering and delivery app, aims to make it easy to fill your refrigerator and pantry with your personal favourites and staples when you need them. Of the different models built, the Support Vector Machine (SVM) was picked due to its higher F1 score and comparable accuracy. Without high-quality care coordination, patients can bounce back from home to the hospital and the emergency room, sometimes repeatedly. Throughout the summer, we have started prototyping the Portfolio Viewer module and putting structure around the Research Module. These methods make neural networks good at finding complex nonlinear relationships among predictor and response variables as well as interactions between predictor variables. Spam messages are identical messages sent to numerous recipients by email or text message for purposes such as mass marketing, gaining clicks on a website, scamming users, and stealing data. In our analysis, we chose the primary elements to be Antimony (Sb), Germanium (Ge), and Tellurium (Te), and the dopants for these combinations of primary elements are Ti, Bi, BiN, Mo, N, Sc, Al, AlSc, SiC, In, C, Si, SiN, O, W, Se, Er, Gd, Sn. This study focuses on providing subscribers a more customized way to explore their music and easily create their own playlists through clustering preferentially on chosen audio features reflective of their mood. The analyses also performed unsupervised clustering, which suggested there were distinct groups of those strongly connected with the school through other affiliations and those who were not.
Utilizing transaction level data, the following analysis serves first to provide a summary of the variables that impact credit card approval rate, and second to use that analysis as a foundation to design a reporting tool that will allow a user to track the approval rate over time and address the specific risks that arise. Our team at TCP is always striving to develop new products for the private labelling customers as well as our own e-commerce businesses. The modified forecast is then used in demand planning to request replenishments from the plants. This analysis uses Logistic Regression, Decision Tree, and Generalized Additive Model. In particular, within the logit regression context, a simple remedy could be applied to justify the cut-off probability, such that the choice-based sampling technique and the complete data sampling technique display the same explanatory power in forecasting the bankruptcy classification. This paper focuses on the cycle of data in a technological project, starting from instrumentation and tracking to reporting and deriving the business impact of the product. was posed, and through exploratory data analysis we show what the data hold, what they don't hold, possible deficiencies, and produce some insights into the segmentation of their donations. In multivariate analysis of covariance tests, firm size showed a significant positive association with overall firm performance while disruption event announcement showed a significant negative association with overall performance. Qi Sun, Application of Data-Mining Methods in Bank Marketing Campaigns, December 5, 2013 (Jeffrey Camm, Yichen Qin) This data set was posted on Kaggle as a competition.
(Dungang Liu, Liwei Chen) Ramkumar Selvarathinam, Identification of Child Predators using Naïve Bayes Classifier, June 2015, (Yan Yu, Michael Magazine) It is crucial in these countries to have disease resistant crops to progress the economy. Four models viz. This case study is an analysis of risk with respect to mortgage aggregates in recessionary environments and a deeper look at how these variables behave and relate to each other. Today, in the fast moving consumer goods space, retailers are facing quite a lot of challenges in providing the best offerings in terms of variety and affordability of products. Over recent years, the utilization of antidepressant drugs in the form of doctors' prescriptions and Medicaid reimbursements has been rising steadily. However, Tomahawk simulation provides a confidence interval and does not require finding an ideal exponent. Many other results have very interesting biological implications. This analysis would include identifying bottlenecks via cycle times and machine breakdowns. Their business model in the USA is divided into major FETs. This project deals with online retail store data taken from the UCI Machine Learning Repository. Based on the predictive model, they can know their customers' propensity to churn. Many carriers have started working on SMS spam by allowing subscribers to report spam and taking action after appropriate investigation. One of its applications has been found in identifying diseased trees from Quickbird imagery. With the advent of e-commerce industries, everything from household items to cars is being made available online. This analysis aims at segmenting customers into income groups of above and below 50K. R treats all the categorical variables as factors without any requirement of creating dummy variables. In other words, it’s a tool to understand the credit risk of a borrower.
At the mathematical level the problem is abstract and exact, removed from the practical problems of the real estate developer or marketing expert. Both comprise a large collection of packages for specific tasks and have a growing community that offers support and tutorials online. The key to great sales for a new product is knowing the right kind of customers (who are most profitable) for it and deploying your best agents (high performing sales persons) out to them. Based on these models, simulators are designed to evaluate the right price for a given product in order to facilitate minimum margin leakage. Analytics and sports have been around together for a while now; with advancements in sports technology, the application of analytics to sports increases with every passing day. This project examined which promotions are most effective for pastas, sauces, pancake mixes, and syrups in Kroger stores. The planning and replenishment for these items is performed using a basic MRP planning system. In this report, several different data mining, advanced statistical and machine learning techniques are explored and used to predict readmission rates. The question we are trying to answer here is whether we can differentiate between the two types of wine without looking at their appearance. Each category has important metrics the business users are concerned with. Different machine learning techniques are explored: logistic regression, random forest, gradient boosting and neural networks. It has information on every trip taken since September 2010. Natural language processing is one of the most prominent techniques that helps us deal with unstructured data in a more effective and quick manner. The data on Kaggle came as two data sets: a training set with 100,514 observations and a testing set with 10,353 observations.
To automate the processes, I developed R code utilizing several packages, such as tidyr and dplyr, and I also used data cleaning and aggregation techniques. This project aims to create a tool that recommends the optimal price to maximize profit by using historic sales data and the price elasticity of demand for top selling items within each state in which EG America operates. Use cases such as auto-tagging of the compound and tumor types to free text, summarization of insights, and identifying hot/trending/emerging topics would assist business teams to derive hidden insights. In this analysis, a content-based recommendation has been built based on the plot description of a movie. It has been observed that the use of contraceptives is much lower in developing countries than in developed ones. Recommender systems help customers by suggesting a probable list of products from which they can easily select the right one. Hence computer vision would provide more time-efficient methods to identify diseased crops. This report presents the basic theory of the incremental response model and how the model is applied to an Oriental Trading Company dataset. Lori Mueller, Norwood Fire-Department Simulation Models: Present and Future, May 28, 2009 (David Kelton, Jeffrey Camm) Pallavi Singh, Anomaly Detection in Revenue Stream, August 2019, (Dungang Liu, Brittany Gearhart) This is done through the Extraction, Transformation, and Load processes. Yiyin Li, Foreclosure in Cincinnati Neighborhoods, July 2018, (Yan Yu, Charles Sox) Multivariate analysis refers to a set of statistical methods that simultaneously analyze multiple measurements on each individual or object under investigation. Forecasting stock return is an important topic in the finance industry. Sandeep J. Patkar, Examination of Capacity and Delay at Airports and in the U.S.
National Airspace System, January 14, 2013 (David Rogers, Edward Winkofsky) The target variable is the sales of a particular product at an outlet. They were stratified into age groups, with finer (6-month) categories for the younger children, and coarser (1-year) categories for the older children. Shashi Kanth Kandula, Internship with Ethicon (Johnson & Johnson), August 2015, (Andrew Harrison, Yichen Qin) The purpose of this paper is to apply various data and statistical techniques to analyze and model the bank marketing data and predict whether a client will subscribe for a term deposit. On the contrary, direct marketing focusses on a small set of people who are believed to be interested in the product. The goal of this study is to model the bankruptcy probabilities of seven major banks under different economic scenarios. Prediction models are built with logistic regression, classification tree and support vector machine algorithms, and their performances are compared. which are aggregated to a score and to analyze key factors driving this score. The analytics department was asked to come up with preliminary results that would either push the decision for the company to implement a test or not. This project involves an application of Markov-chain modeling to Ohio's unemployment rate. It helps them in structuring store layout, designing various promotions and coupons, and combining all of these with a customer loyalty card, which makes the above strategies even more useful. With a deeper architecture, a maximum accuracy of 97% can be achieved. Sandeep Kavadi, Analytical Approach to designing Financial Hardship Programs for Consumer Loan Products, August 2020, (Michael Fry, Siddharth Krishnamurthi). The Viticulture Commission of the Vinho Verde Region rated the wine quality using the physiochemical properties.
Production sites perform partial functions of mixing and distribution centers, and products are directly shipped to retail stores via third-party service vendors. Diabetes is an increasingly common disease among the U.S. population. They would like to open an additional Garden Center in the Ohio, Kentucky, and Indiana (OKI) region and need to know where the optimal location for it would be. Currently, there are over 500 bike-sharing programs around the world. We will explore some techniques like transfer learning to classify the images in this project. They can help real-estate developers better target their potential buyers and better plan for new construction. The goal of this study is to build models which will predict the number of tickets sold based on different factors. S. Zeeshan Ali, Image Classification with Transfer Learning, July 2017, (Peng Wang, Liwei Chen) Grouping volunteers by their donation behavior allowed United Way to better evaluate the interaction between volunteering and donating. Christopher Uberti, General Motors Energy/Carbon Optimization Group, July 2018, (Michael Fry, Erin Lawrence) Lian Duan, Fair Lending Analysis, August 2016, (Julius Heim, Dungang Liu) By implementing the findings from this study, Cintas could optimize its uniform service management and form a comprehensive sales strategy in various regions and industries. Deep convolutional neural networks (CNN) are now widely used in medical imaging diagnosis for various diseases such as pneumonia, Alzheimer’s, cancer, diabetic blindness, etc. Aniket Sunil Mahapure, Quora Question Pairs Data Challenge, August 2019, (Peng Wang, Liwei Chen) The objective of this report is two-fold. This solution has reduced their time by 50%. Scoring models currently available are highly specific to the web platforms, hence a new model is required to be built from scratch in Cerkl’s context.
The aim is to create a customized view of risk for each data set, using proprietary wildfire hazard grading. This project also predicts the top 5 recommended movies per user based on their historical ratings from the Movielens database. Matt Policastro, District Configuration Analysis through Evolutionary Simulation, July 2017, (Peng Wang, Michael Magazine) It was observed that the Time Series model produced the best forecast, with an accuracy of within 18 units (for a 5 hour forecast) of the actual values on average. If the actual purchase does not happen for any reason, the seller has to be refunded the fee amount as a credit. All the models are built using the ‘sklearn’ package in Python. We are interested in identifying the patterns of the enrollment of students in the STEM programs in UC and understand if factors like gender, race, etc. One of the products is Customer 360, which provides a unified customer profile including internal and external data sources. While each method has its advantages and disadvantages, the models created using LASSO Regression to predict heating and cooling load balance simplicity and accuracy relatively well. Yelp is an online platform, both website and app, where people write about their experiences at places they visited. The Nelson-Siegel factor model is used to fit the Treasury bond yield data from 1985 through 2000. The analysis is conducted for Macy’s stores overall and at a region level. People usually purchase products online after looking at the star rating, and after shortlisting a product, they usually read several text reviews written by other customers who have purchased it. Aishwarya Nalluri, Multiple Projects with Sevan Multi Site Solutions, July 2017, (Michael J. Fry, Doug Gafney) The unsuccessful students are defined as students who were enrolled in a program for more than 6 years and didn't graduate.
Our client collects a huge amount of instreaming data from their smart restroom solutions and wants to capture business value out of the data. In my project, I have discussed how we can predict all the genres associated with a movie just by looking at the plot of the movie with the help of NLP and multi-label classification using algorithms like Naïve Bayes and Support Vector Machines. But SMS spam has steadily grown over the past decade. This makes the whole process of model building easier, but in hindsight the models include all levels of a categorical variable without taking into consideration their significance. When deploying a wireless network in the telecom industry, it is important to develop a proper sales strategy that will maximize revenue while filling the network to capacity with sales to both residential and business customers. I have tested two different machine learning models (SVM and CNN) to find the best model for the classification task. We present a summary of published papers and textbooks written on this subject to discuss mathematical notation, covariance structures, and model-building and model-selection methods in LMMs. Air-fare and the cost of flying have always been as much a matter of discussion as they have been a matter of speculation. Ishali Tiwari, Prediction of Wine Quality by Mining Physiochemical Properties, August 2019, (Yan Yu, Ishan Gupta) The train set contains 60,000 labeled images and the test set contains 10,000. However, tree-based bagging and random forest methods provide excellent performance with a potential overfitting problem. The reorder point prediction would reduce the frequency of ordering and would help the floor managers in making better reorder plans. This paper looks at a problem for a professional association of management science that organizes an annual symposium and invites papers in multiple branches of management.
There is a substantial rise in the number of people who are engaging in learning activities either through a learning management system or through in-class learning technologies. There are a limited number of resources available to each team in the form of wickets and balls. Due to this, a large number of diapers have to be disposed of, leading to substantial monetary loss. Market basket analysis with association rules is used to discover the top strong rules of product association based on different association measures e.g. Sushmita Sen, Digit Recognition with Machine Learning, July 2017, (Yan Yu, Liwei Chen) An exploratory data analysis was performed on the data, to look for outliers and individual distributions of the variables. The first one was a comparison of the latency in the start time of the tool when set up with different servers, a Historian server and an MS SQL. Time series analysis is commonly used in economic forecasting as well as analyzing climate data over large periods of time. The direct mail campaign is one of the acquisition strategies employed by the Fifth Third Bank. The model is trained over a calibration period of 9 months and the predictions are tested for a holdout period of 4 months. Text mining was applied to a character variable in order to investigate forms of reward given to kids. Hasnat Shad Tahir, Movie Recommendation Systems, August 2020, (Peng Wang, Dungang Liu). In the last 2 decades, the e-commerce industry has consistently leveraged data to improve sales, advertisement and customer experience. Variable ‘total interest received till data’ showed contrasting behaviour. Part of this project is the creation of a dataset using ArcGIS software to geocode addresses from the Hamilton County Auditor and identify their respective neighborhoods, which enables demographic information from census data to be joined as possible predictors.
Juvin Thomas George, Automation of Customer-Centric Retail Banking Dashboards, August 2016, (Andrew Harrison, David Bolocan) Acquiring new customers as well as retaining the existing customers is quite challenging in the current scenario.