Friday 25 December 2015

Simple steps to convert video file format

I copied a few videos from a DVD to a pen drive to play them on my TV, but soon realised that my Android TV could not play them, so I had to convert the files to MP4 format.
I used VLC media player and followed these simple steps:

Friday 31 July 2015

Run a User Recommendation on Mahout

For a beginner like me, the following steps were useful for running Mahout's recommendation algorithm.

1- I downloaded the dataset from http://grouplens.org/datasets/movielens/. The directory contains many files; the one I thought would be useful was ratings.data (568.3 MB), containing the fields userId, movieId, rating and timestamp. Mahout's recommenders expect interactions between users and items as input. Every line of the input file has the format userID,itemID,value, where userID and itemID refer to a particular user and a particular item, and value denotes the strength of the interaction (e.g. the rating given to a movie).

(Perform the following steps after logging in as hduser, the user for the Hadoop cluster.)

2- I removed the last field, the timestamp, from the file (it was not required for this recommendation) using the following command, which drops the 4th column and saves the result as a .csv file:
cut --complement -f 4 -d, ratings.data > ratings.csv
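
To see what this cut command does before running it on the full file, here is a small sketch that applies it to two made-up rows. The function name strip_timestamp and the sample values are only for illustration; --complement is a GNU cut option, so this assumes GNU coreutils (the default on Ubuntu).

```shell
# Drop the 4th comma-separated field (the timestamp) from each line.
# Reads the files given as arguments, or standard input if none are given.
strip_timestamp() {
  cut --complement -f 4 -d, "$@"
}

# Try it on two made-up rows in the userId,movieId,rating,timestamp layout.
sample=$(mktemp)
printf '1,169,2.5,1204927694\n1,2471,3.0,1204927438\n' > "$sample"
strip_timestamp "$sample"
rm -f "$sample"
```

Each output line keeps only the first three fields, which is the userID,itemID,value layout described in step 1.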

3- Next I created a directory in the Hadoop file system to store the ratings file, using the following command:
hadoop fs -mkdir /mahout_data/

4- Then I copied the ratings file to HDFS using the following command:

hadoop fs -put /home/hduser/mydata/ml-latest/ratings.csv /mahout_data/

5- Go to the Mahout directory (cd /usr/local/mahout/bin/) and issue the following command to run the job. (The output path must be unique, i.e. it must not already exist, and JAVA_HOME must be set properly.)

./mahout recommenditembased -s SIMILARITY_LOGLIKELIHOOD -i hdfs://localhost:9000/mahout_data/ratings.csv -o hdfs://localhost:9000/ratings_test/ --numRecommendations 25

-i hdfs://localhost:9000/mahout_data/ratings.csv - denotes the input file.

-o hdfs://localhost:9000/ratings_test/ - denotes the output path.

recommenditembased - means we are creating item-based recommendations, not user-based ones. There is a difference between the two: a user-based recommender finds similar users and sees what they like, while an item-based recommender sees what the user likes and finds similar items. Mahout's item-based recommendation algorithm takes customer preferences by item as input and generates output recommending similar items, with a score indicating whether a customer will "like" the recommended item.

Choosing a similarity measure for use in a production environment requires careful testing, evaluation and research. For this example I used the Mahout similarity class name SIMILARITY_LOGLIKELIHOOD.

6- The job will run for a couple of minutes. You can also see the output from the web interface.

7- You can check the output file; it contains two columns: the userID and an array of itemIDs and scores.
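
The two-column output can be awkward to read, so here is a minimal sketch that flattens it to one recommendation per line. It assumes each output line looks like userID, a tab, then [itemID:score,itemID:score,...]; the function name flatten_recommendations and the sample line are made up for illustration, and the part-r-* file name in the comment is an assumption about how the reducer output files are named.

```shell
# Turn "userID<TAB>[item:score,item:score,...]" lines into
# "userID<TAB>item<TAB>score" lines, one recommendation per line.
flatten_recommendations() {
  awk -F'\t' '{
    gsub(/^\[|\]$/, "", $2)          # strip the surrounding brackets
    n = split($2, pairs, ",")        # one entry per item:score pair
    for (i = 1; i <= n; i++) {
      split(pairs[i], kv, ":")
      print $1 "\t" kv[1] "\t" kv[2]
    }
  }' "$@"
}

# Made-up sample line, only to show the shape of the result.
printf '3\t[101:4.5,205:4.0]\n' | flatten_recommendations

# On the cluster, something like this should work (file name is an assumption):
#   hadoop fs -cat hdfs://localhost:9000/ratings_test/part-r-* | flatten_recommendations
```

Once flattened, the output can be filtered for a single user with ordinary tools such as grep or sort.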


References :

http://mahout.apache.org/users/recommender/intro-itembased-hadoop.html
http://grouplens.org/datasets/movielens/
http://info.mapr.com/rs/mapr/images/PracticalMachineLearning.pdf

Mahout Installation

I wanted to try data mining on big data, so I installed Mahout. Here are the steps I followed for a successful installation.

The prerequisites for installing Mahout are JDK, Maven and a Hadoop cluster.


  • sudo apt-get install maven
  • Download the latest distribution of Mahout from http://www.apache.org/dyn/closer.cgi/lucene/mahout/
  • Unzip it and copy it to the desired location
          cp -R /home/surabhi/Documents/Documents/egap/Mahout/mahout/ /usr/local/
  • Issue ls to check the packages inside it

  • cd /usr/local/mahout/distribution
  • sudo mvn install (install Maven 3.0.1 or above for the Mahout .20 distribution, otherwise it will throw an error)
  • Your installation is complete if you see the following screen
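
The Maven version requirement in the steps above can be checked before starting the build. This is a small sketch, not part of the original steps: the helper name version_ge is made up, it relies on GNU sort -V for version ordering, and it assumes mvn -version prints a first line beginning with "Apache Maven" followed by the version number.

```shell
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only run the check if Maven is actually installed.
if command -v mvn >/dev/null 2>&1; then
  installed=$(mvn -version 2>/dev/null | awk '/^Apache Maven/ {print $3; exit}')
  if version_ge "$installed" 3.0.1; then
    echo "Maven $installed is new enough"
  else
    echo "Maven $installed is too old, install 3.0.1 or above"
  fi
fi
```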

Tuesday 7 July 2015

Resolve the installation problem of "rmongodb"

Being a new user of R as well as MongoDB, I wanted to connect R to a MongoDB database, but had to struggle before I could establish the connection.
The version of R I was using was 3.0.1. Because it was old, I got an error whenever I ran install.packages("rmongodb"). Ultimately I had to upgrade R using the following steps, after which I was able to install the rmongodb package.
  • sudo gedit /etc/apt/sources.list
  • Add the following line, as I am using Ubuntu 14.04.2 (use the command lsb_release -c to see the codename)
    • deb http://cran.cnr.berkeley.edu/bin/linux/ubuntu trusty/
  • gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
  • sudo apt-get update
  • sudo apt-get upgrade
Now the new version is installed. Within R I issued the command sessionInfo(), which shows R version 3.2.1 (2015-06-18).

Now use the following commands to install the rmongodb package and connect to MongoDB:
  • install.packages("rmongodb")
  • library(rmongodb)
  • To connect to the local MongoDB
    • mongo <- mongo.create()

I will keep posting as I proceed with R, Shiny and MongoDB.

Thursday 21 May 2015

Resolve the connection problem in mongoDB


I was trying to set up an admin password in MongoDB using the admin database, but something went wrong and I started getting the error "couldn't connect to server 127.0.0.1:27017 at src/mongo/shell/mongo.js:145 exception: connect failed".
After a few attempts I fixed it using the following commands:

Step 1: Remove lock file.
sudo rm /var/lib/mongodb/mongod.lock

Step 2: Repair mongodb.
sudo mongod --repair

Step 3: Start mongodb.
sudo start mongodb
or
sudo service mongodb start

Step 4: Check status of mongodb.
sudo status mongodb
or
sudo service mongodb status

Step 5: Start mongo console.
mongo


Reference : http://stackoverflow.com/questions/12831939/couldnt-connect-to-server-127-0-0-127017/17793856#17793856

Tuesday 10 March 2015

Install Microsoft Office on UBUNTU

I was trying to install Microsoft Office 2007 on Ubuntu 14.10. Last time, on Ubuntu 12, I installed it simply by using the Wine Windows program loader, but this time I got the error "newer windows needed", so I had no option but to try something else.

I installed it using PlayOnLinux:




  1. sudo apt-get install playonlinux
  2. sudo apt-get install winbind
  3. Start playonlinux
  4. You will see the following screen

    • Click on Install

    • Choose the Microsoft Office version you want to install

    • Choose the location/correct path

    • Enter the licence key; the installation will start