Thursday, 18 December 2014

Data and Text Mining - sentiment_sentences dataset

Competency 8.1

Using LightSIDE, the sentiment_sentences.csv dataset was loaded and features were extracted with the Unigram feature space on the Feature Extraction panel. The experiment was then run using Logistic Regression with 10-fold cross-validation.

The experiment is aimed at counting the positive and negative words in a review taken from the sentiment_sentences dataset to determine whether the overall review is good or bad.

The model evaluation metrics are shown below:

Accuracy = 75.9%

Kappa =  .52
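For reference, both metrics can be derived from a confusion matrix. The sketch below is a minimal illustration; the function name `accuracy_and_kappa` and the example counts are hypothetical and are not taken from the LightSIDE experiment.

```python
def accuracy_and_kappa(confusion):
    """confusion[i][j] = number of items with true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    n = len(confusion)
    # Observed agreement is the share of items on the diagonal (plain accuracy)
    observed = sum(confusion[i][i] for i in range(n)) / total
    # Chance agreement: for each class, row marginal times column marginal
    expected = sum(
        (sum(confusion[i]) / total) * (sum(row[i] for row in confusion) / total)
        for i in range(n)
    )
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for illustration only (not the experiment's data)
example = [[40, 10], [15, 35]]
acc, kappa = accuracy_and_kappa(example)
print(acc, kappa)
```

If the two classes are balanced (an assumption about this dataset), chance agreement is 0.5, so an accuracy of 75.9% would correspond to a kappa of about (0.759 - 0.5) / 0.5 ≈ 0.52, in line with the value reported above.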


Competency 8.2

To make better use of the positive and negative words in the review, we extended the basic feature extractor with bigrams and trigrams in addition to unigrams. As the evaluation metrics below show, this model is slightly better than the unigram-only baseline above.

Accuracy =  76.6%
Kappa = .53
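To make the feature spaces concrete, here is a rough sketch of what unigram, bigram and trigram extraction produces for one sentence. It assumes simple whitespace tokenization and binary presence features; the function name `extract_ngrams` is hypothetical, and LightSIDE's own tokenizer may behave differently.

```python
def extract_ngrams(text, max_n=3):
    """Return the set of 1..max_n word n-grams found in text."""
    tokens = text.lower().split()  # assumption: whitespace tokenization
    features = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            features.add("_".join(tokens[i:i + n]))
    return features


feats = extract_ngrams("not a good movie")
print(sorted(feats))
```

A bigram such as `not_a` or a trigram such as `not_a_good` carries negation that the unigram `good` alone would miss, which is one plausible reason for the small improvement.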

Competency 8.3

When we set the number of features to 3500, we got an accuracy of 76.9% and a kappa of .54.
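Limiting the number of features can be sketched as keeping only the top-ranked ones. The example below ranks by document frequency, which is an assumption made for illustration; LightSIDE may rank features by a different criterion, and `select_top_features` is a hypothetical name.

```python
from collections import Counter


def select_top_features(documents, k):
    """documents: list of feature sets; keep the k features found in the most documents."""
    doc_freq = Counter()
    for feats in documents:
        doc_freq.update(set(feats))
    # Sort by descending document frequency, then alphabetically so ties are deterministic
    ranked = sorted(doc_freq, key=lambda f: (-doc_freq[f], f))
    return set(ranked[:k])


# Tiny made-up corpus; in the experiment above k would be 3500
docs = [{"good", "movie"}, {"bad", "movie"}, {"good", "plot"}]
kept = select_top_features(docs, k=2)
print(kept)
```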


Competency 8.5

Using a dataset from another text category (Movie Reviews.csv) and configuring basic features such as unigrams, bigrams, trigrams and punctuation, we got an accuracy of 76.3% and a kappa of .45.
  





  
  

Wednesday, 10 December 2014

Data and Text Mining - collaborative learning process analysis

Week 7: 

Competency 7.1: Describe prominent areas of text mining.

Unstructured text mining is an area which is seeing a sudden spurt in adoptions for business applications. The spurt in adoption is triggered by heightened awareness about text mining and the reduced price points at which text mining tools are available today. Text mining is being applied to answer business questions and to optimize day-to-day operational efficiencies as well as improve long-term strategic decisions. The objective of this article is to demystify the text mining process and examine its ROI by exploring practical real-world instances where text mining has been successfully applied in three industries:

1.     Automotive industry (warranty management)
2.     Health care industry
3.     Credit card industry

Text Mining in the Automotive Industry

It’s been estimated that warranties cost automotive companies more than $35 billion in the U.S. annually. Considering this tough environment, it is imperative that auto companies explore all opportunities for reducing costs. Optimizing warranty cost is a very important lever in the cost equation for automobile manufacturers. If one is able to get even a marginal improvement in money spent in warranty cost, it can have a multiplier effect on the overall bottom line. One of the most underutilized dimensions of optimizing warranty cost is input from service technicians’ comments. From those comments, the text mining process can surface nuggets of component defect insights yielding interventions for preventing them in future.

Text Mining in the Healthcare Industry

Most countries typically spend anywhere between 3-10% of their GDP on healthcare. The healthcare industry is a huge spender on technology and, with the proliferation of hospital management systems and low-cost devices to log patient statistics, there is a sudden increase in the breadth and depth of patient data. By mining the comments of doctors’ diagnosis transcripts, outputs can yield information that benefits the healthcare industry in numerous ways, such as:
1.  Isolating the top 10 diseases by keyword frequencies per region and leveraging the findings to optimize the mix of tablets/medicines to stock on the limited outlet shelf, keeping in mind the changes in frequency of disease related keywords.
2.  Based on doctors’ comments, an early warning system can be woven within text mining outputs to detect sudden changes to “chatter” from doctors regarding specific diseases. For example, if the frequency of the keyword lungs or breathing exceeds 45 appearances in the last 30 days for a given ZIP code or region, it can be a clue to excessive environmental conditions which are resulting in respiratory problems. A proactive intervention can be activated to remedy the situation.
The components of such a successful text mining solution can be found in Figure 1 below.

                                                                 
Figure 1
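The early warning rule described in point 2 above could look roughly like the following. This is only a sketch: the record layout `(note_date, region, text)`, the function name `keyword_alerts` and the sample notes are all hypothetical, and the whitespace tokenization ignores punctuation handling.

```python
from datetime import date, timedelta


def keyword_alerts(notes, keywords, today, window_days=30, threshold=45):
    """notes: iterable of (note_date, region, text). Return regions over the threshold."""
    start = today - timedelta(days=window_days)
    counts = {}
    for note_date, region, text in notes:
        # Only count notes written inside the look-back window
        if not (start < note_date <= today):
            continue
        words = text.lower().split()
        hits = sum(words.count(k) for k in keywords)
        counts[region] = counts.get(region, 0) + hits
    return {region for region, n in counts.items() if n > threshold}


# Made-up notes: 46 mentions in one ZIP code, 3 in another
notes = [(date(2014, 12, 1), "10001", "patient reports breathing trouble")] * 46
notes += [(date(2014, 12, 1), "20002", "lungs clear")] * 3
alerts = keyword_alerts(notes, {"lungs", "breathing"}, today=date(2014, 12, 10))
print(alerts)
```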

 Text Mining in the Credit Card Industry

With the proliferation of credit cards, companies need to do the difficult balancing act of identifying which card features (i.e., line of credit, billing cycle, outlet points and coverage) are resonating with customers and, at the same time, minimizing the number of defaults/recovery related interventions. Text mining can help optimize both the collection process and the customer experience.

1.  A top ten complaint keyword watch list can be generated by mining the inbound customer service rep (CSR) call transcripts on a daily basis. From this, you can filter out keywords that were expressed by high-value customers. For example, if the keyword billing error occurs for customers with a credit limit over $200,000, then relationship managers can call the customer and put interventions into the billing process to help prevent reoccurrence.

2.  Text mining can also be used to rate call center staff performance. As an example, a large credit card company in the U.S. had about 600 call center reps receiving inbound calls. Every rep was expected to enter verbose comments to record the nature of the call, but not all were entering detailed text. At one end of the spectrum, there were call center representatives entering an average of 5 to 6 lines, whereas at the other end, there were a few who entered just 3 to 5 words. As a result, the organization was missing out on valuable intelligence whenever only sparse text was recorded. A text mining process was built which gave a keyword frequency count for each call center representative. The bottom decile had to undergo additional training to ensure that they entered detailed text, which is valuable for the credit card company. Please see figure 2 below. 

                                                                         

                                                                     Figure 2
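The rep-level ranking described in point 2 above can be sketched as follows. The article does not say exactly how the keyword frequency count was computed, so this example uses average words per comment as a stand-in; the function name `bottom_decile_reps` and the sample data are hypothetical.

```python
def bottom_decile_reps(comments_by_rep):
    """comments_by_rep: dict mapping rep id -> list of comment strings."""
    # Assumption: average words per comment is a proxy for how detailed a rep's notes are
    avg_words = {
        rep: sum(len(c.split()) for c in comments) / len(comments)
        for rep, comments in comments_by_rep.items()
        if comments
    }
    ranked = sorted(avg_words, key=lambda rep: (avg_words[rep], rep))
    cutoff = max(1, len(ranked) // 10)  # bottom 10%, at least one rep
    return ranked[:cutoff]


# Ten made-up reps whose comments contain 1, 2, ..., 10 words
reps = {f"rep{i:02d}": ["word " * (i + 1)] for i in range(10)}
bottom = bottom_decile_reps(reps)
print(bottom)
```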

In a diverse set of industries ranging from credit cards to auto to healthcare and beyond, the text mining process is slowly being adopted to mine gigabytes of unstructured data. In this tough economic environment, as the pressure to optimize the efficiency of business processes increases, using unstructured text mining techniques on previously ignored data such as comments from technicians, doctors and call center representatives can provide competitive differentiation. This competitive advantage can be in terms of optimizing internal business processes and managing external customer-facing experiences which, in turn, can have a multiplier effect on the overall bottom line. As Marcel Proust said, “The real voyage of discovery consists not in seeking new landscapes, but in having new eyes.” Unstructured data has always been lying around, but never “discovered.” All it takes are “new eyes” within the organization to look at the same unstructured data to gain new bottom-line impacting insights.


Competency 7.2: Detail subareas of text mining such as collaborative learning process analysis.

Data and Text Mining - overview

DM/TM is a technique that consists of applying data analysis and discovery algorithms that, under acceptable computational efficiency limitations, produce a particular enumeration of patterns (or models) over the data (Fayyad et al., 1996). Data mining searches for patterns in a data set using methods such as neural networks, symbolic machine learning algorithms and probabilistic reasoning. A characteristic of the symbolic algorithms field is the incorporation of background knowledge: labeled examples are added to an unlabeled data set so that a learner can later classify the unlabeled data. There is no predefined number of labeled examples that must be inserted in the database; however, the more labeled examples a database contains, the easier and more accurate the classification will be. Semi-supervised learning was chosen because of its flexibility and accuracy in using the incorporated knowledge (the ideal state), represented by labeled examples in the data set, to classify the students’ performance in the collaborative process, represented by unlabeled examples. For each classification made, it is possible to know its accuracy level and the patterns used to define the value. Another reason is the ability to work with an undetermined number of examples, although it is important to provide a minimum quantity of data.

Competency 7.3: Use tools such as LightSIDE in a very simple way to run a text classification experiment.

Training and evaluating newsgroup topic dataset predictive model
The evaluation was configured to use 20 folds in the cross-validation.

Evaluation metric:

Accuracy = 0.5796 ≈ 58.0%
Kappa = 0.4414 ≈ .44
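As a reminder of what 20-fold cross-validation does, the sketch below splits item indices into 20 folds and uses each fold once as the test set. The round-robin assignment and the name `k_fold_indices` are assumptions for illustration; LightSIDE may assign items to folds differently (for example, at random).

```python
def k_fold_indices(n_items, k):
    """Yield (train_indices, test_indices) for each of k folds."""
    # Round-robin assignment: item i goes to fold i % k
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical dataset of 100 items split into 20 folds
splits = list(k_fold_indices(n_items=100, k=20))
print(len(splits), len(splits[0][0]), len(splits[0][1]))
```

Each item is tested exactly once, and the reported accuracy and kappa summarize the predictions made across all held-out folds.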

Competency 7.4: Describe how models might be used in Learning Analytics research, specifically for the problem of assessing some reasons for attrition along the way in MOOCs.

This endeavor (text mining, collaborative learning process analysis) holds the potential for enabling substantially improved on-line instruction both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context sensitive collaborative learning support on an as-needed basis. 


Monday, 1 December 2014

Data, Analytics, and Learning

Competency 6.1: Feature Engineering

Feature engineering is the art of creating predictor variables and is the least well-studied part of the process of developing prediction models. It matters because a model will never be any good if its features (predictors) aren't any good.
Some processes of feature engineering are:

1.       Brainstorming features
2.       Deciding what features to create
3.       Creating the features
4.       Studying the impact of features on model goodness
5.       Iterating on features if useful

Competency 6.2: Diagnostic Metrics  

There are various diagnostic metrics; one of them is the ROC curve, which stands for Receiver-Operating Characteristic curve. ROC is used when the model is predicting something that has two values, such as:
1.       Correct/Incorrect
2.       Gaming the system/Not gaming the system
3.       Student drops out/Does not drop out
It applies when the prediction model outputs a probability or another real value rather than only a yes/no answer.
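A small sketch of how an ROC curve and the area under it can be computed from two-valued labels and real-valued model outputs. The labels and scores are made up, and the function names `roc_points` and `auc` are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false_positive_rate, true_positive_rate) points, one per threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; everything scored >= threshold is predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoid rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


# Made-up example: 1 = student drops out, 0 = student does not drop out
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))
print(area)
```

An area of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two values.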


Friday, 7 March 2014

Women in Nigeria protest Boko Haram killings in the north


Some women, under the aegis of Nigerian Women Morn, embarked on a peaceful protest on Thursday calling for an end to the “mindless killings and abduction” of pupils by the terrorist group, Boko Haram, in the North East region.

On the night of February 24, gunmen believed to be Boko Haram insurgents attacked the town of Buni Yadi in Yobe State killing over 40 people including 29 pupils of the Federal Government College in the town.

The Buni Yadi attack came on the heels of the killing of over 30 students at the College of Agriculture in the same state last September.

In Lagos, about 150 women clad in black marched, while singing mournful and protest songs, from the premises of Lagos Television in Agidingbi to the office of the Lagos Governor, Babatunde Fashola, in Alausa.

“We the Nigeria women are deeply concerned with the escalating rate of violence in the North East,” a statement by the protesting women read.

“We are particularly moved by the senseless killings of innocent children in the Federal Government College, Yobe and the abduction of 25 girls from their schools in Borno. We commiserate with the families of the slain children, women and men of Adamawa and Borno States and join in the solidarity to say NO MORE! ENOUGH OF THE KILLINGS!”

The protesting women called on the government to do all it can to stop the orgy of violence.
“We are here today to prove a point and that point is Nigerian women morn. We are deeply saddened. We are concerned; we are worried. The colossal waste of lives, children are being killed; girls are abducted and used as sex slaves. Just yesterday, elders were rounded up and were killed. Killing has become a recurrent decimal and there comes a time in a nation when women, all of us will rise up in unison and say Enough is enough and we are saying stop the killings.

“We are here to say government must do enough. We must agonise less and organise for change. Our security operatives must be proactive and like we said adequate compensation the government must protect lives and property,” said the President, Women Arise and Campaign for Democracy (CD), Joe Odumakin.

“Formally we want to say that Nigerian women all over the country irrespective of tribe, irrespective of race, from East West North and South we have the same blood that flows in our veins and we are using this occasion to call on those who are terrorising the land to sheath their swords. We’ve had enough we want to stop tears, we want to stop the agony and we say, Nigeria must survive,” she added.

The convener of Nigerian Women Morn, Laila St. Matthew Daniel, also spoke of the reasons for the protest.

“This is a peaceful non-political, non-tribal protest. We are calling on government to rise up and hear our cries and do something to what is happening to our children, our women and our fathers because it’s not just the children and that’s why we’ve come out. Women are the matrix of the society; women are the core of the society.

“We just say stop the killing, stop the persecution, stop the genocide. Let our children be because they’re the future of Nigeria and enough is enough. There comes a time when there is a trigger and the trigger is now that you see all of us together. So we just say to the government, we are tired of the killings, we are tired of the suffering, we are tired of the bombing they should find a lasting solution to this problem,” she said.

Another protester and Executive Director of Women Advocates, Abiola Akiyode, also spoke on reasons for the protest.

“Women have come out to ask for peace and to say enough is enough. All the killings must stop. We cannot continue waking up every morning to hear of mass slaughtering. This is genocide. We don’t want to see this anymore and that’s why we are here together to speak with one accord and say no to all these.

“One of the things we are proposing is for a need for us to have a national strategy to address it. It is obvious our security agencies have not done enough, it is obvious the government has not done enough. We cannot continue to die needlessly. These deaths are preventable,” she said.

The Lagos government delegation, which addressed the protesting women, was led by the Head of Service, Josephine Williams; Special Adviser to the Governor on Information and Strategy, Lateef Raji; and the Commissioner for Information, Lateef Ibirogba.

“Your mission this morning is a very noble one. Honestly everybody shares in this pain,” said Mrs. Williams.

“I’m a mother, a grandmother, so I know exactly where you’re coming from and each time we hear of these killings you imagine how everybody stays spellbound to their television and wondering when it would cease.


“Some people have lost husbands; some people have lost wives. Some people have lost their children. Innocent people are being killed and I can feel that pain that goes through each and every one of you. I just want to say that women. Intelligence gathering entails information giving. Unless information comes, sometimes intelligence gathering cannot be as fruitful.”

PDP slams APC Manifestos


The PDP has described the manifestos of the APC as a roadmap to anarchy typical of all anti-democratic coalitions.

Mr Olisa Metuh, the PDP National Publicity Secretary, said in Abuja on Thursday at a news conference that the APC manifesto lacked character and depth.

According to him, the manifesto released by the APC at its just concluded National Summit did not address any issue.

Metuh said that the manifesto ranked security of lives and property low and gave no clue as to the party's preparedness to tackle terrorism in the country.

He said that the APC had no credible recipe for job creation nor had it shown the strength of character to fight corruption more than what the PDP was doing at the moment.

“The PDP created anti-corruption agencies such as the ICPC, the EFCC and established the Freedom of Information Act to further give teeth to the war on graft.

“The Federal Government has shown no preferences in its battle on corruption as senior party leaders have at one time or another been made to face the law on charges of corruption.”

He said that the spirited defence for the suspended CBN Governor, Mallam Sanusi Lamido Sanusi, mounted by the APC was because its leaders benefited immensely from his regime.

The party spokesman further debunked APC's claim that it was the first political party in the country to launch a code of conduct.

According to him, the PDP in 2006, launched a comprehensive code of conduct under an omnibus entitled “Survival Kit”.

“This kit contained documents such as Desirable Qualities of a Member and Code of Conduct for PDP aspirants and candidates.

“We also have the Peoples Democratic Institute, an intellectual arm of the party whose major work is the systemic research, inculcation and internalisation of democratic ethos,” Metuh said.

He further said that the recent opinion poll from which the APC said it derived its manifesto was a familiar product from a political party that subsisted on lies and deception.

“They sponsored same in Anambra and their governorship candidate, Chris Ngige, came a distant third.

“This is after they sponsored same in Ondo where its candidate in the governorship election, Rotimi Akeredolu also came third,” Metuh said.


He, however, warned that the unseen thrust of the APC manifesto was to balkanise the country and cause disaffection among the people. (NAN)

Thursday, 6 March 2014

Gaddafi's son Saadi extradited to Libya

Niger has extradited Muammar Gaddafi's son Saadi to the Libyan capital Tripoli, the Libyan government has said.

The third son of the former Libyan leader is being held in Hadaba Prison in the capital.

“The Libyan Government received today (6/3/2014) Al Saadi Gaddafi. He arrived in Libya and is located at the Libyan judiciary police station,” the Libyan government said on its official Facebook page on Thursday.

“The Libyan Government thanks the President of the Republic of Niger, Mahamadou Issoufou, we also thank the Niger Government and the people of Niger for their cooperation with the Libyan Government in pledging its commitment to the treatment of the accused on the principles of justice and international norms in dealing with prisoners. God save Libya.”

Saadi was granted entry to Niger on humanitarian grounds after the Gaddafi government was toppled.

Niger had previously refused to hand over Saadi, who fled south to the West African state in September 2011 as Libyan forces gained the upper hand over his father's forces, because he feared he would face execution in Libya.

In 2011, Interpol issued a "red notice" asking its member states to arrest Saadi with a view to extradition if they found him on their territory.

In December 2011, Mexican authorities foiled a plot to smuggle Saadi from Niger into Mexico.


Before the revolution, Saadi was best known for captaining Libya's national football team, and making appearances for Italian Serie A sides Perugia and Udinese.


Wednesday, 26 February 2014

Boko Haram kills 29 secondary school students in Yobe

 The JTF confirmed the attack.

Reports from Yobe State indicate that about 29 students of the Federal Government College, Buni Yadi, were killed on Monday night while they slept in their dormitories.

The spokesperson of the Joint Task Force, JTF, in the state, Lazarus Eli, confirmed the attack to Aljazeera network. He however did not give the exact number of casualties.

Mr. Eli said the gunmen “opened fire on student hostels.”

He said details are still sketchy due to lack of telephone access and it is still not clear how many students were affected in the attack.

The outlawed Boko Haram sect is suspected to be behind the attack which took place around 2 a.m.
The sect had carried out a similar attack in Yobe last September killing over 40 students at the College of Agriculture, Gujba.

Also, earlier in July last year, Boko Haram carried out an overnight attack on students of Government Secondary School, Mumoda, killing over 40 students.

Mr. Eli said the military has already dispatched a team to Buni Yadi to track and apprehend the killers.

Yobe, like Borno and Adamawa, has been under emergency rule since May 2013 as the military tries to dislodge the Boko Haram insurgents. Despite the emergency rule, hundreds of people have been killed in different attacks in the affected states.