Showing posts with label Scansafe. Show all posts

Monday, March 10, 2008

Apple Mac OSX, Squid and Scansafe

The Scansafe service does not formally support using Squid as your traffic concentrator; they would rather you use the Connector. The Connector is Windows-only, which is bad if you don't use Windows. On the other hand, the Connector sends the user information as extra encrypted header detail, which is good.

So if you don't use Windows you can use Squid as your concentrator, but the user information is sent in the clear. (Scansafe is granular to user level, so you will want user-based and group-based policies.)

If you can live with this, then you'll need to modify your squid.conf file.

Here's mine (NB lines may be wrapped):

cache_peer scansafe.tower.ip parent 8080 7 no-query no-digest no-netdb-exchange login=*

# scansafe.tower.ip - the tower Scansafe tell you to use
# parent - you can choose parent, sibling or multicast. Only parent applies here
# 8080 - the port Scansafe use
# 7 - the ICP port. It is not used because we're not doing ICP queries, but the cache_peer syntax expects a value in this position
# no-query - squid will not send ICP queries to this peer to see if the next hop is available
# no-digest - squid will not request cache digests from this peer
# no-netdb-exchange - squid will not request the peer's ICMP round-trip (netdb) data
# login=* - this passes your user details through. Username only, not password!

cache_dir ufs /opt/squid/var/cache 100 16 256

# You'll notice the non-standard directory structure here. It's due to installing squid on OSX.

auth_param basic program /opt/squid/libexec/ncsa_auth /opt/squid/etc/squid_pwd

# This is for the authentication routine. ncsa_auth only supports basic authentication, which is effectively clear text, so you will see the passwords passing over the wire. The presumption here is that this is for a small network, like mine.

acl no_auth_required dstdomain "/opt/squid/etc/no_auth_required"
http_access allow no_auth_required
acl ncsa_users proxy_auth REQUIRED
http_access allow manager localhost
http_access deny manager
http_access allow ncsa_users
http_access deny !Safe_ports
acl BadSites dstdomain "/opt/squid/etc/bad-sites.squid"
http_access deny BadSites
acl no_scansafe dstdomain "/opt/squid/etc/no_scansafe"
always_direct allow no_scansafe
http_access deny CONNECT !SSL_ports
http_access allow localhost
http_access allow allowed
http_access deny all
always_direct deny all
never_direct allow all

# All these options give me control over what is passed to Scansafe, and which sites require authentication. For example, I don't mind unauthenticated access to the Apple update sites. Likewise, if a site should not be accessible under any circumstances, I add it to BadSites.
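As a rough illustration of adding a site to BadSites, here is a minimal shell sketch. The helper name add_bad_site, the sample domain and the squid binary path are all assumptions for illustration and are not part of my setup as posted. It relies on the dstdomain file format of one domain per line, where a leading dot also matches subdomains.

```shell
#!/bin/sh
# Sketch: append a domain to a dstdomain list file unless it is already there.
# add_bad_site is a hypothetical helper, not part of squid.
add_bad_site() {
    list_file=$1
    domain=$2
    # -x matches whole lines only, -F treats the domain as a literal string.
    if ! grep -qxF -- "$domain" "$list_file" 2>/dev/null; then
        printf '%s\n' "$domain" >> "$list_file"
    fi
}

# Demonstration against a temporary file rather than the live list.
list=$(mktemp)
add_bad_site "$list" .ads.example.com
add_bad_site "$list" .ads.example.com   # second call changes nothing
cat "$list"

# On the real system, something like this would follow (binary path assumed):
#   add_bad_site /opt/squid/etc/bad-sites.squid .ads.example.com
#   /opt/squid/sbin/squid -k reconfigure
```

Squid only reads the list files when it starts or is reconfigured, so the change does not take effect until then.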

Sunday, December 30, 2007

Removing Duplicate lines in a file

Like many people I use Squid to:

- save bandwidth through caching

- control access to certain sites, e.g. adware / tracking web sites, by applying ACLs

A key third purpose for me is to act as a concentrator / connector to an upstream service, Scansafe. Scansafe has the usual lists of web site categories, but the really useful (and desirable) element is that it does in-stream analysis of the pages the user requests.

This is critical because in the new web 2.0 world, where anyone can upload content, lists alone simply won't work. Lists also fail where a trustworthy site hosts links or content that have been compromised, e.g. India Times.

One of the outcomes of this is that I have 2 sources of bad stuff I need to block: analysis of the squid logs, and analysis of the Scansafe realtime reports.

So what I do is copy them into the same file, and then use the following command to remove the duplicates:

awk '!x[$0]++' file > file.new

I got this command from nir_s on unix.com
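To see the merge and de-duplicate steps end to end, here is a small sketch using made-up sample data. The file names from_squid.txt and from_scansafe.txt and the example domains are placeholders, not my real lists.

```shell
#!/bin/sh
# Sketch with sample data; file names and domains are placeholders.
workdir=$(mktemp -d)

# Domains picked out of the squid logs (sample data).
printf '%s\n' ads.example.com tracker.example.net > "$workdir/from_squid.txt"
# Domains picked out of the Scansafe reports (sample data, one overlap).
printf '%s\n' tracker.example.net malware.example.org > "$workdir/from_scansafe.txt"

# Copy both sources into the same file.
cat "$workdir/from_squid.txt" "$workdir/from_scansafe.txt" > "$workdir/file"

# x[$0]++ is 0 (false) the first time a line is seen, so !x[$0]++ is true
# only for first occurrences; awk prints those and skips later repeats.
awk '!x[$0]++' "$workdir/file" > "$workdir/file.new"

cat "$workdir/file.new"
```

Unlike sort -u, this keeps the lines in their original order and does not need the input to be sorted first.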

 

www.hutsby.net
