UK ICO and French CNIL Increase Activity Around Cookies and Consent Practices

Perhaps the only thing higher than temperatures this summer in the European Union is the level of regulatory attention being paid to data-driven advertising and website cookie practices (including similar tracking technologies within mobile applications and other non-browser environments, collectively referred to here as “cookies”). This TrustArc blog post summarizes the major announcements and publications regulators have issued over the last few weeks, including what is expected to follow, and how TrustArc helps.

UK ICO Report on Ad Tech, RTB and Privacy

First, the United Kingdom’s Information Commissioner’s Office (ICO) released its “Update Report Into Adtech and Real Time Bidding” on June 20th, concluding that advertising technology entities and those involved in real-time bidding (RTB) should reassess their privacy notices, lawful bases for processing, and uses and sharing of personal data in light of the GDPR, since many have not done so to this point. The ICO is in the midst of evaluating practices...


State of #AI 2019 Report

I highly recommend the #StateofAI 2019 report by Nathan Benaich and Ian Hogarth, which I have followed from the start. The report is free and you can download it at stateofai 2019. For me it is something like the Mary Meeker report for AI, i.e. a great reference :) Here are my notes. The full report has lots of slides, charts, and diagrams; my notes are text only and cover what was of interest to me. AI will be a force multiplier on technological progress because everything around us today, ranging from culture to consumer products, is a product of intelligence. The report considers the following key dimensions: Research, Talent, Industry, China (considered as a distinct category), and Politics. Reinforcement learning (RL): rewarding “curiosity” enables OpenAI to achieve superhuman performance at Montezuma’s Revenge. StarCraft integrates various hard challenges for ML systems: operating with imperfect information, controlling a large action space in real time, and making...

Privacy and AI - How Much Should We Really Care

Summary: More data means better models, but we may be crossing a line beyond what the public can tolerate, both in the types of data collected and in our use of it. The public seems divided: targeted advertising is good, but the increased invasion of privacy is bad. Headlines are full of alarm. The public is up in arms. The internet is stealing their privacy. Indeed, the Future of Humanity Institute at Oxford rates this as the most severe problem we will face over the next 10 years. As data scientists, how much should we care? Well, more data means better models and less data means less accurate models. So in a sense, the value we bring to the table will be directly impacted if government regulation takes many of our data sources away. So the answer is likely that we should care a lot. However, “privacy” has...

Major Factors Keeping Facial Recognition from Mass Adoption

Artificial Intelligence and Machine Learning are accelerating and reshaping various industries. One of the most rapidly developing and progressive domains is Facial Recognition (FR). Its implementation in many spheres, from public security to retail and healthcare, only proves its potential. Despite FR’s broad dissemination, there are many cases where FR still makes mistakes; media reports are filled with stories of FR’s racial discrimination, for example. The reasons for such failures vary, yet companies already using the technology have hope for its improvement and future benefits. Research by the National Institute of Standards and Technology has shown that FR technology has improved more than 20-fold since 2014. Moreover, according to Allied Market Research, the facial recognition market is likely to reach $9.6 billion by 2022, with a CAGR of 21.3% between 2016 and 2022. While the continued development of the technology seems almost a given...

Constructing Role Objects and Interpreting Role Conflicts Through the Lens of Stress

In my previous post, I discussed the relationship between role conflict and performance. I suggested that, all things being equal, role conflict might be the primary determinant of employee performance. Companies direct all sorts of resources toward gathering data for recruitment purposes. When all things are about the same, much of that data collection is irrelevant. This means that if a pool of recruitment prospects is relatively homogeneous in terms of their abilities, the balance of analysis should be focused on role conflict. In this blog, I will consider the structure of role objects and interpret role conflicts through the lens of stress. A role conflict occurs when the demands of two roles are incompatible. In my model, each role object that a person has contains two components: 1) gates, or the role prerequisites; and 2) traps, or the role barriers. An individual carries a persona containing a number of different roles. When a particular gate is found...
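The post describes the role-object structure only in prose. As a minimal sketch of how it might be represented in code, assuming the class names and the conflict rule below (one possible reading, not the author's specification):

```python
from dataclasses import dataclass, field

@dataclass
class Role:
    """A role object with two components: gates (prerequisites) and traps (barriers)."""
    name: str
    gates: set = field(default_factory=set)  # role prerequisites
    traps: set = field(default_factory=set)  # role barriers

@dataclass
class Persona:
    """An individual's persona carries a number of different roles."""
    roles: list = field(default_factory=list)

    def conflicts(self):
        # Hypothetical rule: flag role pairs where one role's trap
        # is another role's gate (an assumed reading of "role conflict")
        return [(a.name, b.name)
                for a in self.roles for b in self.roles
                if a is not b and a.traps & b.gates]
```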

Co-integration and Structural Breaks Time Series Analysis using R on 100 year bond yields

Co-integration in time series analysis describes series that move together over the long run: each may wander on its own, but some combination of them remains stable, so one series can be tracked through another. In capital markets, for example, the stock of an industry or sector leader sets the direction and many smaller companies follow it. Another example is crude oil and gasoline: the price of gasoline depends on the price of crude oil, and crude oil prices consistently drive gasoline prices. To analyze a similar co-integration, Moody's corporate AAA and BBB bond yields are used here; corporate BBB yields are co-integrated with AAA yields. These bond yield series were downloaded from FRED (the St. Louis Fed's economic data service) using the getSymbols() function from the quantmod package. The downloaded data runs from 1920 to 2019, about 100 years. Before plotting, the downloaded data is converted to a time series using the ts() function. After plotting, it is clearly visible how the bond yields are co-integrated.
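The original analysis is in R; as a compact sketch of the same Engle-Granger co-integration check in Python (the FRED series codes AAA and BAA below, for Moody's seasoned Aaa and Baa corporate bond yields, are my assumption, not taken from the article):

```python
import pandas_datareader.data as web
from statsmodels.tsa.stattools import coint

# Moody's seasoned corporate bond yields from FRED (assumed series codes)
aaa = web.DataReader("AAA", "fred", start="1920-01-01")
baa = web.DataReader("BAA", "fred", start="1920-01-01")

# Align the two monthly series and drop missing observations
df = aaa.join(baa).dropna()

# Engle-Granger two-step test: a low p-value is evidence
# that the two yield series share a long-run equilibrium
t_stat, p_value, _ = coint(df["AAA"], df["BAA"])
print(f"t-statistic: {t_stat:.3f}, p-value: {p_value:.4f}")
```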

Scaling Innovation:  Whiteboards versus Maps

I love watching the NBA’s Golden State Warriors play basketball. Their offensive “improvisation” is a thing of beauty in their constant ball movement in order to find the “best” shot. They are a well-oiled machine optimizing split-second decisions in an ever-changing landscape that is heavily influenced by questions such as:

Who is my defender?
What are the strengths of my defender?
From where is help likely to come if I make a move to the basket?
Who is likely to be open if help does come?
Who has a defensive mismatch?
Who is hot?
What’s the game situation?
What is the shot clock status?
Is this the “best” shot or should I keep looking?

The coordinated decision-making is truly a thing of beauty, but here’s the challenge: how would you “scale” the Warriors? You can’t just add another player to the mix – even a perennial all-star like Boogie Cousins – and have the same level of success. One...

Deploying Python application using Docker and AWS

The use of Docker in conjunction with AWS can be highly effective when it comes to building a data pipeline. Have you ever had this situation before? You are building a model in Python which you need to send over to a third party, e.g. a client or colleague, but the person on the other end cannot run the code! Maybe they don't have the right libraries installed, or their system is not configured correctly. Whatever the reason, Docker alleviates this situation by storing the necessary components in an image, which can then be used by a third party to deploy an application effectively. In this example, we will see how a simple Python script can be incorporated into a Docker image, and this image will then be pushed to ECR (Elastic Container Registry) in AWS.

Python Script

Consider a simple Python script for calculating a cumulative binomial...
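The script itself is truncated in this excerpt; as a stand-in sketch of a cumulative binomial calculation of the kind described (the parameter values are illustrative, not the article's):

```python
from scipy.stats import binom

# Cumulative binomial: probability of at most k successes
# in n independent trials, each with success probability p
n, p, k = 20, 0.3, 8
print(f"P(X <= {k}) = {binom.cdf(k, n, p):.4f}")
```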

28 Statistical Concepts Explained in Simple English - Part 18

This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, Python, R, Tensorflow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To keep receiving these articles, sign up on DSC. Below is the last article in the series Statistical Concepts Explained in Simple English. The full series is accessible here. Source for picture: here

28 Statistical Concepts Explained in Simple English - Part 18

Unidimensionality: Definition, Examples
Uniform Distribution / Rectangular Distribution: What is it?
Unimodal Distribution in Statistics
Unit Root: Simple Definition, Unit Root Tests
Univariate Analysis: Definition, Examples
Upper and Lower Fences
Upper Hinge and Lower Hinge
Validity Coefficient: Definition and How to Find it
Variability in Statistics: Definition, Examples
Variance: Simple Definition, Step by Step Examples
Variance Inflation Factor
Voluntary Response Sample in Statistics: Definition
Wald Test: Definition,...

A Digestible Action Plan for Startups’ Cybersecurity Success

It’s never too early for a start-up business to begin to strategize and operationalize its cybersecurity goals; in fact, it’s a necessary prerequisite for high-yield growth. And yet, with all the high-velocity activity and rapid decision-making that characterizes most startups’ early existence, it can be easy to overlook some of the critical prophylactic steps that must be taken to safeguard a nascent company’s value potential. The importance of this cannot be overstated, given that the harm to a startup’s reputation and brand name can be existential if proper controls are not in place. A recent Forbes CommunityVoice article by start-up founder Isaac Kohen offers some helpful starting points for businesses of all sizes to keep in mind. The major takeaways are summarized below, with additional perspective added.

Growing a Cybersecurity Culture from Day One

A critical reminder for all is that cybersecurity is not at heart an infrastructure issue; it’s a cultural one...

Data Science Central Monday Digest, July 8

Monday newsletter published by Data Science Central. Previous editions can be found here. The contribution flagged with a + is our selection for the picture of the week. To subscribe, follow this link.

Top 10 Data Science Use Cases in Energy and Utilities

The energy sector is under constant development, and more significant inventions and innovations are yet to come. Energy use has always been tied to other industries, such as agriculture, manufacturing, and transportation, and these industries tend to consume more energy every day. The energy sector is highly demanding in terms of applying new technologies and developing new energy sources, and its rapid development directly influences social development. People now face the challenges of smart energy management and consumption, the adoption of renewable energy sources, and environmental protection. Smart technologies play a crucial role in resolving these matters. In this article, we attempted to present the most vivid data science use cases in the industry of energy and utilities.

Failure probability model

The failure probability model has won its place in the energy industry. The efficiency of the machine...
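The excerpt cuts off as the first use case begins; as a minimal illustration of what a failure probability model can look like (the sensor features, data, and coefficients below are hypothetical, not from the article):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical sensor readings per unit: temperature, vibration, age (years)
rng = np.random.default_rng(42)
X = rng.normal(loc=[70.0, 0.5, 10.0], scale=[10.0, 0.2, 5.0], size=(500, 3))
# Hypothetical labels: failures more likely for hot, vibrating, older units
risk = 0.05 * (X[:, 0] - 70) + 4 * (X[:, 1] - 0.5) + 0.1 * (X[:, 2] - 10)
y = (risk + rng.normal(scale=1.0, size=500) > 0).astype(int)

model = LogisticRegression().fit(X, y)
# Predicted probability of failure for a new, heavily stressed unit
print(model.predict_proba([[85.0, 0.9, 18.0]])[0, 1])
```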

How Retailers Use Artificial Intelligence to Innovate Customer Experience and Enhance Operations

Digitalization influences how businesses operate and build and maintain relationships with customers. With the internet open 24/7, consumers can save time and shop online at their convenience. In 2017, global eCommerce sales accounted for 10.2 percent of all retail sales ($2.3 trillion US). This figure is projected to reach 17.5 percent in 2021, with revenue from eCommerce sales expected to grow to $4.88 trillion US.

Figure: eCommerce share of total retail sales worldwide from 2015 to 2021. Source: Statista

Physical stores still have the lion’s share of sales, but the growing demand for online experiences shouldn’t be ignored. To remain competitive, retailers must allow in-store customers to enjoy the benefits of online shopping. Fast checkout, personalized recommendations, or instant access to customer care at any time are a few services that can be implemented with the help of artificial intelligence. For this article, we discussed current and potential applications of AI in retail, as well as...

Data Quality Case Studies: How We Saved Clients Real Money Thanks to Data Validation

Machine learning models grow more powerful every week, but the earliest models and the most recent state-of-the-art models share the exact same dependency: data quality. The maxim “garbage in, garbage out,” coined decades ago, continues to apply today. Recent examples of data verification shortcomings abound, including JP Morgan/Chase’s 2013 fiasco and this lovely list of Excel snafus. Brilliant people make data collection and entry errors all the time, and that isn’t just our opinion (although we have plenty of personal experience with it): Kaggle did a survey of data scientists and found that “dirty data” is the number one barrier for data scientists. Before we create a machine learning model, before we create a Shiny R dashboard, we evaluate the dataset for a project. Data validation is a complicated multi-step process, and maybe it’s not as sexy as talking about the latest ML models, but as the data science consultants of...
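The post doesn't show its validation steps in this excerpt; as a minimal sketch of the kind of checks a multi-step validation often starts with (the column names and rules are hypothetical):

```python
import pandas as pd

df = pd.DataFrame({
    "order_id": [1, 2, 2, 4],
    "amount": [19.99, -5.00, 7.50, None],
})

# A few basic validation checks; real pipelines layer many more on top
problems = []
if df["order_id"].duplicated().any():
    problems.append("duplicate order_id values")
if df["amount"].isna().any():
    problems.append("missing amounts")
if (df["amount"].dropna() < 0).any():
    problems.append("negative amounts")

print(problems or "all checks passed")
```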

Lightweight but effective way of documenting a group of Jupyter Notebooks

My app Qubiter has a folder full of Jupyter notebooks (27 of them, in fact). Opening a notebook takes a short while, which is slightly annoying. I wanted to give Qubiter users the ability to peek inside all the notebooks at once, without having to open all of them. Qubiter’s new SUMMARY.ipynb notebook allows the user to do just that. SUMMARY.ipynb scans the directory in which it lives to find all Jupyter notebooks (other than itself) in that directory. It then prints for every notebook it finds (1) a hyperlink to the notebook, and (2) the first cell (which is always markdown) of the notebook. This way you can read a nice, automatically generated summary of all the notebooks without having to open all of them. If you find a notebook that you want to explore further, you can simply click on its link to open it. Here is the code...
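The post's code is truncated in this excerpt (the real implementation lives in the Qubiter repo). As a minimal sketch of the idea, assuming the notebooks sit in the current directory and each begins with a markdown cell, as the post states:

```python
import json
import os
from IPython.display import Markdown, display

# Scan the directory for notebooks, skipping the summary notebook itself
for fname in sorted(os.listdir(".")):
    if not fname.endswith(".ipynb") or fname == "SUMMARY.ipynb":
        continue
    with open(fname, encoding="utf-8") as f:
        nb = json.load(f)
    display(Markdown(f"[{fname}]({fname})"))              # (1) hyperlink to the notebook
    display(Markdown("".join(nb["cells"][0]["source"])))  # (2) its first (markdown) cell
```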

Predicting Hotel Cancellations with Support Vector Machines and SARIMA

Hotel cancellations can cause issues for many businesses in the industry. Not only is revenue lost when a customer cancels; cancellations also make it harder to coordinate bookings and adjust revenue management practices. Data analytics can help overcome this issue by identifying the customers who are most likely to cancel, allowing a hotel chain to adjust its marketing strategy accordingly. To investigate how machine learning can aid in this task, ExtraTreesClassifier, logistic regression, and support vector machine models were employed in Python to determine whether cancellations can be accurately predicted. For this example, both hotels are based in Portugal. The Algarve hotel dataset available from Science Direct was used to train and validate the model, and then the logistic regression was used to generate predictions on a second dataset for a hotel in Lisbon.

Data Processing

At...
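As a minimal sketch of the classification setup described, here using scikit-learn's SVC on synthetic stand-in features (the article's actual features and data processing are more involved):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in features (e.g., lead time, prior cancellations, rate)
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=1000) > 0).astype(int)  # 1 = cancelled

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```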

How To Choose An NLP Vendor For Your Organization

Original link. Original author: Data Science Central.

USER.NOTIFICATION - WARNING: Notification, recorded at 2019-07-05T06:00:42Z

USER.NOTIFICATION - WARNING: Notification recorded at 07/05/2019 6:00:42 AM UTC
Title: Bad Practice Alert
Message: [BAD_PRACTICE] Clear Change Flag Shape executing with no documents in Clear Quota Change Flag (execution-0ebcceda-91b0-40cb-9c9d-d495440bdd63-2019.07.05)
Environment: Standard Production
Classification: Production
Original link. Original author: Boomi AtomSphere RSS Feed.

How the Mathematics of Fractals Can Help Predict Stock Markets Shifts

The plot shows the log returns for Apple stock. The output from the ADF test is:

Augmented Dickey-Fuller test statistic: -28.653611206757994
p-value: 0.0
Critical values: 1%: -3.4379766581448803, 5%: -2.8649066016199836, 10%: -2.5685626352082207

In general, the “more negative” the ADF test statistic, the more likely we are to reject the null hypothesis, according to which the series is non-stationary (it has a unit root). The above test corroborates the hypothesis that the log return series is indeed stationary: the statistic of around -28.65 is well below the 1% critical value of -3.438, so we can reject the null hypothesis at the 1% significance level (see this link for more details).

The Hurst Exponent

There is an alternative way to investigate the presence of mean reversion or trending behavior in a process. As will be explained in detail shortly, this can be done by analyzing the diffusion speed of the series and comparing...
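Anticipating that explanation, a common quick way to estimate the Hurst exponent is from the scaling of the standard deviation of lagged differences; this sketch is illustrative and not the article's code:

```python
import numpy as np

def hurst_exponent(series, max_lag=100):
    """Estimate H from the scaling law std(x[t+lag] - x[t]) ~ lag**H."""
    lags = np.arange(2, max_lag)
    tau = [np.std(series[lag:] - series[:-lag]) for lag in lags]
    # The slope of the log-log fit is the Hurst exponent
    return np.polyfit(np.log(lags), np.log(tau), 1)[0]

# H < 0.5 suggests mean reversion, H near 0.5 a random walk, H > 0.5 trending
rng = np.random.default_rng(1)
random_walk = np.cumsum(rng.standard_normal(10_000))
print(f"H of a random walk: {hurst_exponent(random_walk):.3f}")  # close to 0.5
```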

USER.NOTIFICATION - WARNING: Notification, recorded at 2019-07-04T06:00:45Z

USER.NOTIFICATION - WARNING: Notification recorded at 07/04/2019 6:00:45 AM UTC
Title: Bad Practice Alert
Message: [BAD_PRACTICE] Clear Change Flag Shape executing with no documents in Clear Quota Change Flag (execution-aa1e1d7c-1b84-47f9-83ca-0f212d0f475a-2019.07.04)
Environment: Standard Production
Classification: Production
Original link. Original author: Boomi AtomSphere RSS Feed.

When Your Boss Is an Algorithm

This article was written by Alex Rosenblat in the New York Times Opinion section. There are nearly a million active Uber drivers in the United States and Canada, and none of them have human supervisors. “It’s better than having a real boss,” one driver in the Boston area told me, “except when something goes wrong.” When something does go wrong, Uber drivers can’t tell the boss or a co-worker. They can call or write to “community support,” but the results can be enraging. Cecily McCall, an African-American driver from Pompano Beach, Fla., told me that a passenger once called her “dumb” and “stupid,” using a racial epithet, so she ended the trip early. She wrote to a support rep to explain why and got what seemed like a robotic response: “We’re sorry to hear about this. We appreciate you taking the time to contact us and share details.” The rep offered not...

Open-source Logistic Regression FPGA core for accelerated Machine Learning

Machine learning algorithms are extremely computationally intensive and time-consuming when they must be trained on large amounts of data. Typical processors are not optimized for machine learning applications and therefore offer limited performance. As a result, both academia and industry are focused on the development of specialized architectures for the efficient acceleration of machine learning applications. FPGAs are programmable chips that can be configured with tailor-made architectures optimized for specific applications. Because FPGAs are optimized for specific tasks, they offer higher performance and lower energy consumption compared with general-purpose CPUs or GPUs. FPGAs are widely used in domains like image processing, telecommunications, networking, automotive, and machine learning. Recently, major cloud and HPC providers like Amazon, Alibaba, Huawei, and Nimbix have started deploying FPGAs in their data centers. However, there are currently limited cases of wide utilization of FPGAs in the domain of machine learning. Towards this end, InAccel has released...

How Data Science is Playing a Big Role in Higher Education?

Data science is a growing and promising discipline that has impacted various domains, including higher education. Owing to its ability to use precise methods and platforms to extract insights from data, several academic institutions are incorporating data science into their operations and educational curricula. This helps them engage students, improve educational outcomes and placement records, and boost faculty productivity and research opportunities. Data science and advanced analytics are transforming higher education in more ways than one. Read on to find out how! Every educational institution aims to achieve a high completion rate and to reduce delayed graduations and dropouts. Predictive models in data science and advanced analytics can help colleges identify high-risk students and take precautionary measures to reduce dropouts. Each student has a distinct personality, learning ability, and socio-economic background. This information can help institutions identify students who are most likely to leave, enabling them to come up...

Workload Optimized Compute Servers Are Creating the Need for Converged Clusters

Pooled, also referred to as “converged,” clusters in a unified data environment support disparate workloads better than separate, siloed clusters. Vendors now provide direct support for converged clusters to run key HPC-AI-HPDA (AI, HPC, and High Performance Data Analytics) workloads. The success of workload-optimized compute servers has created the need for converged clusters, as organizations have generally added workload-optimized clusters piecemeal to support their disparate AI, HPC, and HPDA needs. Unfortunately, many of these disparate clusters operate in isolation, dedicating their resources to specific workloads and managed by humans, essentially placing them in a silo that prevents them from benefiting the entire organization. This makes little sense from an operating efficiency perspective, as it wastes operation time as well as OpEx (operating expense) and CapEx (capital expense) dollars. Most organizations, for example, don’t run their infrastructure for deep learning networks on a 24x7 basis. The part-time nature of these workloads means that...