Information retrieval document search using vector space model in R


Mathematically, the closeness between two vectors is measured by the cosine of the angle between them. Along similar lines, we can compute the cosine angle between each document vector and the query vector to measure their closeness. So, to find the documents relevant to a query, we can calculate a similarity score between each document vector and the query vector by applying cosine similarity. Note that there are many variations in the way the term frequency (tf) and inverse document frequency (idf) are calculated; in this post we have seen one variation. The images below show other recommended variations of tf and idf, taken from Wikipedia.
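The tf-idf weighting and cosine-similarity scoring described above can be sketched as follows. The original post works in R; this is a Python illustration under one common tf-idf variant (raw term count × log(N/df)), with a made-up three-document corpus and the query "quick fox" chosen purely for demonstration:

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Build a tf-idf vector for each tokenized document.

    tf  = raw count of the term in the document
    idf = log(N / df), where df is the number of documents
          containing the term (one common variant among many).
    """
    n = len(docs)
    vocab = sorted({t for doc in docs for t in doc})
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    idf = {t: math.log(n / df[t]) for t in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[t] * idf[t] for t in vocab])
    return vocab, idf, vectors

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy corpus (hypothetical, for illustration only).
docs = [
    "the quick brown fox".split(),
    "the lazy brown dog".split(),
    "the quick red fox jumps".split(),
]
vocab, idf, vectors = tf_idf_vectors(docs)

# Weight the query with the same idf values, then rank documents by cosine score.
query = "quick fox".split()
q_counts = Counter(query)
q_vec = [q_counts[t] * idf[t] for t in vocab]
scores = [cosine_similarity(q_vec, v) for v in vectors]
best = max(range(len(docs)), key=lambda i: scores[i])
```

Note that "the", appearing in every document, gets idf = log(1) = 0 and so contributes nothing to any score, which is exactly the damping effect idf is meant to provide.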

Anybody working in Law? - Use cases in the Legal Industry


There are numerous use cases in the legal industry, especially in reviewing briefings; document handling is a major segment where data science (text classification, document segmentation, etc.) can help. Other use cases include OCR and document generation, as done by Narrative Science. IBM Watson has specific features and use cases for legal bots.

Drunk Tesla Driver Arrested For DUI Despite Self-Driving Car

International Business Times

A Bay Area man who took advantage of one of the marvels of modern technology was arrested on the afternoon of Jan. 13. The San Francisco branch of the California Highway Patrol tweeted out an image and accompanying caption Friday, showing an unidentified man who was arrested for a DUI after he was caught passed out behind the wheel of a self-driving Tesla: "When u pass out behind the wheel on the Bay Bridge with more than 2x legal alcohol BAC limit and are found by a CHP Motor." The driver explained that the Tesla had been set on Autopilot. He was arrested and charged on suspicion of DUI.

Could intelligent machines of the future own the rights to their own creations?


Intellectual property may be the legal term for creations, including literary or artistic ones, but there is something inherently human about it as well. It has long been held that only human beings are capable of being intelligent in the fullest sense, and the concept of intellectual property strives to protect the product of such human intelligence. This is reflected in a number of intellectual property laws.

The accuracy, fairness, and limits of predicting recidivism


We are the frequent subjects of predictive algorithms that determine music recommendations, product advertising, university admission, job placement, and bank loan qualification. In the criminal justice system, predictive algorithms have been used to predict where crimes will most likely occur, who is most likely to commit a violent crime, who is likely to fail to appear at their court hearing, and who is likely to reoffend at some point in the future (1). One widely used criminal risk assessment tool, Correctional Offender Management Profiling for Alternative Sanctions (COMPAS; Northpointe, which rebranded itself to "equivant" in January 2017), has been used to assess more than 1 million offenders since it was developed in 1998. The recidivism prediction component of COMPAS--the recidivism risk scale--has been in use since 2000. This software predicts a defendant's risk of committing a misdemeanor or felony within 2 years of assessment from 137 features about an individual and the individual's past criminal record.

Report: China Beating US in Artificial Intelligence Arms Race


Growing investment and greater access to data about people is giving China the edge in the race to develop artificial intelligence, according to news reports Thursday. "The Chinese government's thoughtful investment in AI is a huge accelerator," Andrew Ng, a former Google executive who now runs an AI start-up company, told The Wall Street Journal. Western concerns about privacy and "the ethics of letting machines make important decisions" have led to new policies over collecting personal data and deploying AI, the Journal reports. "Ultimately, AI advances are inextricably founded on the broad use of data to train machines as they go about learning," said Brad Smith, Microsoft's president and chief legal officer. "The question is whether privacy laws will constrict AI development or use in some parts of the world," he added.

3Q: D. Fox Harrell on his video game for the #MeToo era

MIT News

The Imagination, Computation, and Expression Laboratory at MIT's Computer Science and Artificial Intelligence Laboratory has released a new video game called Grayscale, which is designed to sensitize players to problems of sexism, sexual harassment, and sexual assault in the workplace. D. Fox Harrell, the lab's director, and students in his course CMS.628 (Advanced Identity Representation) completed the initial version of the game more than a year ago, and the ICE Lab has been working on it consistently since. But it addresses many of the themes brought to the fore by the recent #MeToo movement. The game is built atop the ICE Lab's Chimeria computational platform, which was designed to give computer systems a more subtle, flexible, and dynamic model of how humans categorize members of various groups. MIT News spoke to Harrell, a professor of digital media and artificial intelligence, about Grayscale (or to give it its more formal name, Chimeria:Grayscale). Q: How does the game work?

Machine learning tools for fairness, at scale


Check out the machine learning sessions at the Strata Data Conference in London, May 21-24, 2018. Hurry--best price ends February 23. The problem of fairness comes up in any discussion of data ethics. We've seen analyses of products like COMPAS, we've seen the maps that show where Amazon first offered same-day delivery, and we've seen how job listings shown to women are skewed toward lower-paying jobs. We also know that "fair" is a difficult concept for any number of reasons, not the least of which is the data used to train machine learning models.

Surviving AI in the Law Firm: Be the One Asking Questions, Not Collecting Answers


There are a lot of frightened lawyers out there, scared that artificial intelligence will gobble up their jobs. Some lawyers are right to be scared: the ones who don't do enough thinking while they make their living. Think of all the times you've been on the phone with a customer service rep and gotten an answer that makes no sense to you (but seems perfectly fine to him). That rep who can only explain his company's policy with, "That's what the computer is saying," is like the lawyer whose job is doomed. For a bright employment future, you want to be the lawyer who scrutinizes the answers AI produces, not just the one who asks the computer questions.

Trusting Machines Requires Transparency


Pamela Pavliscak is a committee member of the IEEE Global Initiative for Ethical Considerations in the Design of Autonomous Systems, a visiting assistant professor at Pratt Institute's School of Information, and founder of the insights and innovation firm Change Sciences. Her work focuses on emotion and identity in the context of people's personal experiences with algorithms. In this Q&A, she explores some of the key considerations and challenges for algorithmic decision making, which will be discussed further in a complimentary webinar hosted by IEEE on 7 December. Question: What is algorithmic decision making? Pavliscak: An algorithm is a set of instructions used by machines to problem-solve or predict.