Looking for a list of all blocked sites in India

  • Thread starter: Jaydip Das
  • Replies: 72
  • Views: 23,834
Ah, I thought you wanted to compile a list of blocked websites on airtel.
Do you already own an x86-based router? (I can't remember)

PFSENSE now 😎😎😎. But I used to do the same with individual sites like leetx on my Mikrotik. My ISP mainly uses Airtel upstream, and I get END OF FILE errors even over HTTPS with blocked sites.

It takes quite some time for pfBlockerNG to look up 4k sites.
 
Here is very unoptimized (sorry about that) but working code to check which sites are blocked.
It reads sites from sitelist.txt and writes the blocked ones to blocked.txt.

Code:
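#!/bin/bash
# Reads sites from sitelist.txt and sorts them into blocked.txt and not-blocked.txt.
BL=0
# Clean the list: drop empty lines, cut everything after the first space, remove stray spaces.
sed -i '/^$/d' sitelist.txt
sed -i 's/ .*$//' sitelist.txt
sed -i 's/ //g' sitelist.txt
LINE=$(wc -l < sitelist.txt)
for (( i = 1; i <= LINE; ++i ))
do
    SITE=$(sed "${i}q;d" sitelist.txt)
    # Start from an empty file so a failed download cannot leave the previous site's page behind.
    : > tmpfile.txt
    # Follow redirects, give up after 3 seconds, keep curl's exit code.
    curl -L -m 3 "$SITE" -o tmpfile.txt
    EXIT="$?"
    # A body containing airtel.in/dot/ is treated as the block page.
    grep -q "airtel.in/dot/" tmpfile.txt
    EXIT1="$?"
    echo
    echo
    echo
    if [[ "$EXIT1" == "0" ]]; then
        BL=$((BL + 1))
        echo "$SITE" >> blocked.txt
        echo "$SITE seems to be blocked"
    elif [[ "$EXIT" =~ ^(5|6|7|28|35|47|52)$ ]]; then
        # Connection-level failure: retry once before calling it blocked.
        curl -L -m 2 "$SITE" -o /dev/null
        EXIT="$?"
        if [[ "$EXIT" =~ ^(5|6|7|28|35|47|52)$ ]]; then
            BL=$((BL + 1))
            echo "$SITE" >> blocked.txt
            echo "$SITE seems to be blocked"
        else
            # The retry got through, so list the site as reachable.
            echo "$SITE" >> not-blocked.txt
        fi
    else
        echo "$SITE" >> not-blocked.txt
    fi
done
echo
echo
rm -f tmpfile.txt
echo "$BL out of $LINE websites found blocked. Please open blocked.txt and not-blocked.txt for the respective lists."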
 
Many blocked websites responded to ping just fine, so I had to use curl.
curl's built-in feature to read from a file was also not very accurate.
Only tested on Ubuntu 20.04 LTS on a Raspberry Pi 4.

It just needs multiple simultaneous pings and the ability to resume from the last state/website on restart, etc.
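
For anyone who wants to try those two additions, here is a rough sketch and not a tested replacement for the script above. It keeps the same file names and the same curl exit codes, but it drops the second retry to stay short. The function names (pending_sites, classify, check_one, run_all) and the JOBS setting are made up for illustration, and it assumes xargs supports -P (GNU and BSD xargs do).

```shell
#!/bin/bash
# Sketch only: resume + parallel checks. All function names and JOBS are hypothetical.

JOBS=${JOBS:-8}   # assumed default: number of simultaneous curl checks

# Print every site in sitelist.txt that is in neither result file yet,
# so a restarted run continues where the previous one stopped.
pending_sites() {
    touch blocked.txt not-blocked.txt
    cat blocked.txt not-blocked.txt | grep -vxF -f - sitelist.txt
}

# Decide from curl's exit code ($1) and the downloaded body file ($2).
classify() {
    if grep -q "airtel.in/dot/" "$2" 2>/dev/null; then
        echo blocked
        return
    fi
    case "$1" in
        5|6|7|28|35|47|52) echo blocked ;;
        *) echo not-blocked ;;
    esac
}

# Check one site and append it to blocked.txt or not-blocked.txt.
check_one() {
    local tmp code
    tmp=$(mktemp)
    curl -s -L -m 3 -o "$tmp" "$1"
    code=$?
    echo "$1" >> "$(classify "$code" "$tmp").txt"
    rm -f "$tmp"
}

# Run check_one on all pending sites, JOBS at a time.
run_all() {
    export -f classify check_one
    pending_sites | xargs -P "$JOBS" -I{} bash -c 'check_one "$1"' _ {}
}
```

Calling run_all starts the checks. If it gets interrupted, calling it again only visits sites that are in neither blocked.txt nor not-blocked.txt yet. Each check uses its own temp file, so parallel downloads do not overwrite each other.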
 
Slightly optimized the code; in around 3 minutes I was able to sort 100 websites.
It saves lists of blocked and non-blocked websites as txt files and reports how many of the x websites were blocked.
It flagged 92 out of 100 websites from the list you provided, of which I can confirm some did actually load (but were mostly ads/domain-for-sale pages),
so it's pretty accurate IMO.
 
This is the output from your first file (That IFF one)

Source

Out of which 342 were not blocked

Source


It took around 1-1.5 hrs to test around 4300 websites on the RPi 4.
If you want to run this program, just use any Debian-based distribution and
Code:
nano blocktest.sh
(Copy and paste the code)
Save by pressing Ctrl + X, then type y, then press Enter.
Code:
chmod +x blocktest.sh
./blocktest.sh

It will read from the list in sitelist.txt.
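
In case the expected format of sitelist.txt is unclear: it is one site per line, and the script cleans the file in place before testing. The snippet below is only an illustration of that cleanup, using the same three sed expressions but printing the result instead of editing the file. clean_list is a made-up helper name and the example.com / example.org entries are placeholders.

```shell
#!/bin/bash
# Illustration only: clean_list is a hypothetical helper, the domains are placeholders.

# Same three sed expressions as the script: drop empty lines, cut everything
# after the first space, remove any remaining spaces. Prints instead of editing.
clean_list() {
    sed -e '/^$/d' -e 's/ .*$//' -e 's/ //g' "$1"
}

sample=$(mktemp)
printf 'example.com\n\nexample.org checked last week\n' > "$sample"
clean_list "$sample"   # prints example.com and example.org, one per line
rm -f "$sample"
```

So notes after a site name are fine, but put the site first on the line with no leading space, since everything from the first space onward is cut.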
 
@Rehan Kumar
Appreciate your work, buddy!
"Out of which 342 were not blocked"
So that means the remaining 3958 sites were blocked, right?
"if you want to run this program, just use any Debian based distribution"
Yeah, I would love to run the same check and see the results for my ISP. But unfortunately I don't have a Raspberry Pi kit, so the only way for me to do this is to check every individual site manually.
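
On the 3958 figure: the total was given as "around 4300" and the result files are appended to on every run, so counting the files directly is probably safer than subtracting. A small sketch, assuming the three file names from the script. count_lines and summary are hypothetical helper names.

```shell
#!/bin/bash
# Sketch only: count_lines and summary are hypothetical helpers.

# Number of distinct lines in a file, or 0 if the file does not exist.
count_lines() {
    if [ -f "$1" ]; then
        sort -u "$1" | wc -l | tr -d ' '
    else
        echo 0
    fi
}

# Compare the list size with the two result files.
summary() {
    total=$(count_lines sitelist.txt)
    blocked=$(count_lines blocked.txt)
    free=$(count_lines not-blocked.txt)
    echo "total=$total blocked=$blocked not-blocked=$free unclassified=$((total - blocked - free))"
}

summary
```

A non-zero "unclassified" value means some sites ended up in neither file, for example after an interrupted run.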
 