Project Ramadan: 1st Prototype

Hello All,

I am quite excited to announce that I have just finished the first prototype of the Quran Database!

I used the Requests Python library to call the Quran.com API and obtain each word of the Quran, together with the data available for it, using the loop below:

import pandas as pd
import requests as r

frames = []  # one small DataFrame per word; combined once at the end
for x in range(1, 115):  # the Quran has 114 chapters
    url = "http://api.quran.com/api/v3/chapters/{}/verses".format(x)
    response = r.get(url).json()
    print('We are in chapter {}\n'.format(x))
    for n, verse in enumerate(response['verses']):
        print('* We are in verse {} under chapter {}\n'.format(n, x))
        # As in the original loop, the last entry of each word list is skipped
        for y, word in enumerate(verse['words'][:-1]):
            print('** We are in word {} under verse {} under chapter {} \n'.format(y, n, x))
            frames.append(pd.DataFrame(word))

# DataFrame.append was removed in pandas 2.0, so concatenate once instead
verses = pd.concat(frames)

This loop yielded the table below:

With some data cleaning and minor adjustments to improve how the data is presented, I was able to obtain the table below!

I intended to commit every step I took to GitHub; unfortunately, as a beginner and an amateur, I forgot. Nonetheless, I will focus on doing that this upcoming week.

Limitations to the Dataset:

Now there are two main limitations to the dataset:

  1. Some verses seem to be missing from the API. According to Google the Quran has 77,430 words, but my dataset only yielded 10,475. I can see that some Chapters/Sooras are not fully available through the API. I will investigate this further by reviewing my code, hopefully while committing my work to GitHub, and by contacting the Quran.com team if I cannot figure it out myself.
  2. I still want to add the Juz number and the city in which each verse, and thus each word, was delivered. This will only require more work in the loop and further use of the API to fill in the data. It should be fairly simple, but I think it will be better implemented after I figure out the first limitation.
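
One possible cause of the first limitation, offered here as an unconfirmed guess, is that the verses endpoint returns its results in pages and the loop only ever reads the first page of each chapter. The sketch below assumes a `page` query parameter and an empty `verses` list once the pages run out; both are assumptions to check against the Quran.com API documentation. `fetch_page` and `collect_verses` are hypothetical helper names, and the sketch uses only the standard library so it can be run on its own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://api.quran.com/api/v3/chapters/{}/verses"


def fetch_page(chapter, page):
    """Fetch one page of verses for a chapter.

    ASSUMPTION: the endpoint accepts a `page` query parameter.
    """
    url = BASE_URL.format(chapter) + "?" + urlencode({"page": page})
    with urlopen(url, timeout=30) as resp:
        return json.load(resp).get("verses", [])


def collect_verses(chapter, fetch=fetch_page, max_pages=1000):
    """Request pages 1, 2, 3, ... and gather every verse returned."""
    verses = []
    previous = None
    for page in range(1, max_pages + 1):
        batch = fetch(chapter, page)
        # Stop on an empty page, or if the server ignores `page`
        # and keeps sending back the same verses.
        if not batch or batch == previous:
            break
        verses.extend(batch)
        previous = batch
    return verses
```

Comparing `len(collect_verses(2))` with the number of verses in the first response for chapter 2 would show quickly whether paging explains the gap. If the counts match, the cause lies elsewhere.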

This will be my primary work in the upcoming week, along with committing my work to GitHub. In the meantime, do you think this ‘prototype’ is ready to be submitted to Kaggle?
