<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments for jduck.net</title>
	<atom:link href="http://jduck.net/comments/feed/" rel="self" type="application/rss+xml" />
	<link>http://jduck.net</link>
	<description></description>
	<lastBuildDate>Mon, 22 Feb 2010 21:48:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>Comment on Scanning with sane&#8217;s scanimage from an ADF scanner to PDF and OCRed Text by Jonah</title>
		<link>http://jduck.net/2008/01/05/ocr-scanning/comment-page-1/#comment-3472</link>
		<dc:creator>Jonah</dc:creator>
		<pubDate>Mon, 22 Feb 2010 21:48:18 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2008/01/05/ocr-scanning/#comment-3472</guid>
		<description>Thanks for the contribution Joe.  I no longer have a scanner with an ADF, but I&#039;m happy to see this is the most popular post on my seldom updated blog.  I&#039;ve added your code as a revision to my original script as a &lt;a href=&quot;http://gist.github.com/311548&quot; rel=&quot;nofollow&quot;&gt;gist&lt;/a&gt; at github for others to work with and fork as they please.</description>
		<content:encoded><![CDATA[<p>Thanks for the contribution Joe.  I no longer have a scanner with an ADF, but I&#8217;m happy to see this is the most popular post on my seldom updated blog.  I&#8217;ve added your code as a revision to my original script as a <a href="http://gist.github.com/311548" rel="nofollow">gist</a> at github for others to work with and fork as they please.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scanning with sane&#8217;s scanimage from an ADF scanner to PDF and OCRed Text by Joe</title>
		<link>http://jduck.net/2008/01/05/ocr-scanning/comment-page-1/#comment-3467</link>
		<dc:creator>Joe</dc:creator>
		<pubDate>Sun, 21 Feb 2010 12:37:09 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2008/01/05/ocr-scanning/#comment-3467</guid>
		<description>Hi,
thanks for this nice script.
I&#039;ve made some changes and improvements:

- added a file-name format to the scanimage batch option (--batch=out%02d.tif)
- compress with zip when calling tiff2pdf
- added ImageMagick image enhancement (a little more contrast) and two-bit TIFF output (scans in Gray and reduces the colors to 4)

Scanning in Gray causes no decrease in speed on my HP 5590.
I&#039;ve also added the possibility to scan more than one document (page sequence) with the ADF.
When you have only one page to scan, it&#039;s faster not to use the ADF.

So you can call scan2pdf like this:
scan2pdf myDocument -&gt; scans one page, without the ADF (saved as myDocument.pdf)
scan2pdf 99 myDocument -&gt; uses the ADF to scan into myDocument.pdf
scan2pdf 3,8,2 myDocument -&gt; uses the ADF and scans 3 page sequences; files are saved to myDocument.01.pdf (3 pages), myDocument.02.pdf (8 pages) and myDocument.03.pdf (2 pages)

Maybe someone will like it:
[code]

#!/bin/bash

SOURCE=&quot;&quot;

if [ $# -gt 1 ]
then

  SOURCE=&quot;--source ADF -l 3&quot;
  outname=$2
  pbreak=$1

  echo &quot;$pbreak&quot; &#124; egrep &quot;[^0-9,]+&quot;
  if [ $? -ne 1 ]
  then
    echo &quot;Check sequence list!&quot;
    exit 1
  fi
else

  pbreak=99
  outname=$1
  SOURCE=&quot;--batch-count=1&quot;

fi

startdir=$(pwd)
tmpdir=scan-$RANDOM

cd /tmp
mkdir $tmpdir
cd $tmpdir
echo &quot;################## Scanning ###################&quot;
scanimage -x 210 -y 297 --batch=out%02d.tif --format=tiff --mode Gray --resolution 300 $SOURCE

start=1
cnt=1
sc=$(echo &quot;$pbreak&quot; &#124; cut -d&quot;,&quot; -f1-99 --output-delimiter=&quot; &quot; &#124; wc -w)
for pb in $(echo &quot;$pbreak&quot; &#124; cut -d &quot;,&quot; -f1-99 --output-delimiter=&quot; &quot;)
do
    ende=$(expr $start + $pb - 1)
    pnr=0
    i=1
    echo &quot;############ Page-Sequence ($cnt), Pages: $pb, Start: $start, End: $ende ############&quot;
    tpages=&quot;&quot;
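    # match only the scanned pages (out01.tif, ...), not output.tif left over from an earlier sequence
    for page in $(ls out[0-9]*.tif); do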
	pnr=$(expr $pnr + 1)
	if [ $pnr -ge $start -a $pnr -le $ende ]
	then
	    echo &quot;... Converting&quot;
	    # increase contrast and reduce colordepth 
	    convert $page -level 15%,85% -depth 2 &quot;b$page&quot; 
	    echo &quot;... OCRing&quot;
	    tpages=&quot;$tpages b$page&quot;
	    i=$(expr $i + 1)
	    echo -n &quot;    &quot;
            tesseract $page $page -l deu
            if [ $sc -gt 1 ]
            then
        	cnts=`printf %02d $cnt`
    		cat $page.txt &gt;&gt; $outname.$cnts.txt
    	    else
    		cat $page.txt &gt;&gt; $outname.txt
    	    fi

	fi
    done

    echo &quot;... Converting to PDF&quot;
    #Use tiffcp to combine output tiffs to a single multi-page tiff
    tiffcp $tpages output.tif
    #Convert the tiff to PDF
    if [ $sc -gt 1 ]
    then
    	cnts=`printf %02d $cnt`
        tiff2pdf -z output.tif &gt; $startdir/$outname.$cnts.pdf
	mv $outname.$cnts.txt $startdir
    else
        tiff2pdf -z output.tif &gt; $startdir/$outname.pdf
	mv $outname.txt $startdir
    fi

    start=$(expr $start + $pb)
    cnt=$(expr $cnt + 1)

done

cd ..
echo &quot;################ Cleaning Up ################&quot;
rm -rf $tmpdir
cd $startdir


[/code]</description>
		<content:encoded><![CDATA[<p>Hi,<br />
thanks for this nice script.<br />
I&#8217;ve made some changes and improvements:</p>
<p>- added a file-name format to the scanimage batch option (--batch=out%02d.tif)<br />
- compress with zip when calling tiff2pdf<br />
- added ImageMagick image enhancement (a little more contrast) and two-bit TIFF output (scans in Gray and reduces the colors to 4)</p>
<p>Scanning in Gray causes no decrease in speed on my HP 5590.<br />
I&#8217;ve also added the possibility to scan more than one document (page sequence) with the ADF.<br />
When you have only one page to scan, it&#8217;s faster not to use the ADF.</p>
<p>So you can call scan2pdf like this:<br />
scan2pdf myDocument -&gt; scans one page, without the ADF (saved as myDocument.pdf)<br />
scan2pdf 99 myDocument -&gt; uses the ADF to scan into myDocument.pdf<br />
scan2pdf 3,8,2 myDocument -&gt; uses the ADF and scans 3 page sequences; files are saved to myDocument.01.pdf (3 pages), myDocument.02.pdf (8 pages) and myDocument.03.pdf (2 pages)</p>
<p>Maybe someone will like it:<br />
[code]</p>
<p>#!/bin/bash</p>
<p>SOURCE=""</p>
<p>if [ $# -gt 1 ]<br />
then</p>
<p>  SOURCE="--source ADF -l 3"<br />
  outname=$2<br />
  pbreak=$1</p>
<p>  echo "$pbreak" | egrep "[^0-9,]+"<br />
  if [ $? -ne 1 ]<br />
  then<br />
    echo "Check sequence list!"<br />
    exit 1<br />
  fi<br />
else</p>
<p>  pbreak=99<br />
  outname=$1<br />
  SOURCE="--batch-count=1"</p>
<p>fi</p>
<p>startdir=$(pwd)<br />
tmpdir=scan-$RANDOM</p>
<p>cd /tmp<br />
mkdir $tmpdir<br />
cd $tmpdir<br />
echo "################## Scanning ###################"<br />
scanimage -x 210 -y 297 --batch=out%02d.tif --format=tiff --mode Gray --resolution 300 $SOURCE</p>
<p>start=1<br />
cnt=1<br />
sc=$(echo "$pbreak" | cut -d"," -f1-99 --output-delimiter=" " | wc -w)<br />
for pb in $(echo "$pbreak" | cut -d "," -f1-99 --output-delimiter=" ")<br />
do<br />
    ende=$(expr $start + $pb - 1)<br />
    pnr=0<br />
    i=1<br />
    echo "############ Page-Sequence ($cnt), Pages: $pb, Start: $start, End: $ende ############"<br />
    tpages=""<br />
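    # match only the scanned pages (out01.tif, ...), not output.tif left over from an earlier sequence<br />
    for page in $(ls out[0-9]*.tif); do<br />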
	pnr=$(expr $pnr + 1)<br />
	if [ $pnr -ge $start -a $pnr -le $ende ]<br />
	then<br />
	    echo "... Converting"<br />
	    # increase contrast and reduce colordepth<br />
	    convert $page -level 15%,85% -depth 2 "b$page"<br />
	    echo "... OCRing"<br />
	    tpages="$tpages b$page"<br />
	    i=$(expr $i + 1)<br />
	    echo -n "    "<br />
            tesseract $page $page -l deu<br />
            if [ $sc -gt 1 ]<br />
            then<br />
        	cnts=`printf %02d $cnt`<br />
    		cat $page.txt &gt;&gt; $outname.$cnts.txt<br />
    	    else<br />
    		cat $page.txt &gt;&gt; $outname.txt<br />
    	    fi</p>
<p>	fi<br />
    done</p>
<p>    echo "... Converting to PDF"<br />
    #Use tiffcp to combine output tiffs to a single multi-page tiff<br />
    tiffcp $tpages output.tif<br />
    #Convert the tiff to PDF<br />
    if [ $sc -gt 1 ]<br />
    then<br />
    	cnts=`printf %02d $cnt`<br />
        tiff2pdf -z output.tif &gt; $startdir/$outname.$cnts.pdf<br />
	mv $outname.$cnts.txt $startdir<br />
    else<br />
        tiff2pdf -z output.tif &gt; $startdir/$outname.pdf<br />
	mv $outname.txt $startdir<br />
    fi</p>
<p>    start=$(expr $start + $pb)<br />
    cnt=$(expr $cnt + 1)</p>
<p>done</p>
<p>cd ..<br />
echo "################ Cleaning Up ################"<br />
rm -rf $tmpdir<br />
cd $startdir</p>
<p>[/code]</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scanning with sane&#8217;s scanimage from an ADF scanner to PDF and OCRed Text by elio</title>
		<link>http://jduck.net/2008/01/05/ocr-scanning/comment-page-1/#comment-2460</link>
		<dc:creator>elio</dc:creator>
		<pubDate>Wed, 19 Aug 2009 12:24:47 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2008/01/05/ocr-scanning/#comment-2460</guid>
		<description>Excellent, Jonah. It works very well! Let me add a few annoyances I bumped into, so that other people can benefit.
a) You&#039;ve got to find out the device name of your scanner. Running scanimage -L will tell you. In my case:
elio@gazelle:$ scanimage -L
device `hpaio:/net/Officejet_Pro_L7500?ip=192.168.1.98&#039; is a Hewlett-Packard Officejet_Pro_L7500 all-in-one

b) tesseract also installs language-dependent resources. On my Ubuntu 9.04 (English US) it installed the German files by default. Go and install the language you need; I also installed tesseract-ocr-eng.

c) In the last step of your program, tiffcp and tiff2pdf could not be found. Fixed by installing the package libtiff-tools.

Although this note is rather long, I want to state that your solution is very simple and effective. Again, my compliments. I encourage everyone to adopt your solution; it took me five minutes and three tries to be up and running.

I still have to find out how to scan a double-sided document. I&#039;m investigating the scanimage command. I&#039;ll post again if I discover how it should be done.
Cheers Elio</description>
		<content:encoded><![CDATA[<p>Excellent, Jonah. It works very well! Let me add a few annoyances I bumped into, so that other people can benefit.<br />
a) You&#8217;ve got to find out the device name of your scanner. Running scanimage -L will tell you. In my case:<br />
elio@gazelle:$ scanimage -L<br />
device `hpaio:/net/Officejet_Pro_L7500?ip=192.168.1.98' is a Hewlett-Packard Officejet_Pro_L7500 all-in-one</p>
<p>b) tesseract also installs language-dependent resources. On my Ubuntu 9.04 (English US) it installed the German files by default. Go and install the language you need; I also installed tesseract-ocr-eng.</p>
<p>c) In the last step of your program, tiffcp and tiff2pdf could not be found. Fixed by installing the package libtiff-tools.</p>
<p>Although this note is rather long, I want to state that your solution is very simple and effective. Again, my compliments. I encourage everyone to adopt your solution; it took me five minutes and three tries to be up and running.</p>
<p>I still have to find out how to scan a double-sided document. I&#8217;m investigating the scanimage command. I&#8217;ll post again if I discover how it should be done.<br />
Cheers Elio</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scanning with sane&#8217;s scanimage from an ADF scanner to PDF and OCRed Text by Charles</title>
		<link>http://jduck.net/2008/01/05/ocr-scanning/comment-page-1/#comment-2246</link>
		<dc:creator>Charles</dc:creator>
		<pubDate>Sat, 27 Jun 2009 21:59:53 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2008/01/05/ocr-scanning/#comment-2246</guid>
		<description>Thank you; works nicely and smoothly.  I have added option --batch-start=101 to the scanimage command, because as written the order of pages in the single tiff file is not correct when more than 10 pages are scanned (with 101, you can scan 900 pages).</description>
		<content:encoded><![CDATA[<p>Thank you; works nicely and smoothly.  I have added option --batch-start=101 to the scanimage command, because as written the order of pages in the single tiff file is not correct when more than 10 pages are scanned (with 101, you can scan 900 pages).</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Using OGR to convert GIS Vector formats by Rizah Murseli</title>
		<link>http://jduck.net/2008/07/18/using-ogr-to-convert-gis-vector-formats/comment-page-1/#comment-1557</link>
		<dc:creator>Rizah Murseli</dc:creator>
		<pubDate>Fri, 12 Dec 2008 08:59:55 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2008/07/18/using-ogr-to-convert-gis-vector-formats/#comment-1557</guid>
		<description>This is a powerful tool of course. It will help me very much!</description>
		<content:encoded><![CDATA[<p>This is a powerful tool of course. It will help me very much!</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Using OGR to convert GIS Vector formats by irimi</title>
		<link>http://jduck.net/2008/07/18/using-ogr-to-convert-gis-vector-formats/comment-page-1/#comment-1534</link>
		<dc:creator>irimi</dc:creator>
		<pubDate>Thu, 27 Nov 2008 15:08:56 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2008/07/18/using-ogr-to-convert-gis-vector-formats/#comment-1534</guid>
		<description>Hi,

This is a powerful tool. I tried to install it on my Debian system, but I don&#039;t have the KML format driver!

How can I fix this?

Thanks</description>
		<content:encoded><![CDATA[<p>Hi,</p>
<p>This is a powerful tool. I tried to install it on my Debian system, but I don&#8217;t have the KML format driver!</p>
<p>How can I fix this?</p>
<p>Thanks</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scanning with sane&#8217;s scanimage from an ADF scanner to PDF and OCRed Text by Nathan</title>
		<link>http://jduck.net/2008/01/05/ocr-scanning/comment-page-1/#comment-1516</link>
		<dc:creator>Nathan</dc:creator>
		<pubDate>Fri, 07 Nov 2008 22:05:49 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2008/01/05/ocr-scanning/#comment-1516</guid>
		<description>Looks interesting. Have you had any luck trying to get the scan button to work in linux?</description>
		<content:encoded><![CDATA[<p>Looks interesting. Have you had any luck trying to get the scan button to work in linux?</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Scanning with sane&#8217;s scanimage from an ADF scanner to PDF and OCRed Text by Links For &#124; Delodder.be</title>
		<link>http://jduck.net/2008/01/05/ocr-scanning/comment-page-1/#comment-1515</link>
		<dc:creator>Links For &#124; Delodder.be</dc:creator>
		<pubDate>Fri, 07 Nov 2008 08:06:10 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2008/01/05/ocr-scanning/#comment-1515</guid>
		<description>[...] Scanning with sane’s scanimage from an ADF scanner to PDF and OCRed Text at Jonah M. Duckles [...]</description>
		<content:encoded><![CDATA[<p>[...] Scanning with sane’s scanimage from an ADF scanner to PDF and OCRed Text at Jonah M. Duckles [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Getting to know PostGIS by dissa</title>
		<link>http://jduck.net/2007/11/06/getting-to-know-postgis/comment-page-1/#comment-1497</link>
		<dc:creator>dissa</dc:creator>
		<pubDate>Fri, 17 Oct 2008 07:10:31 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2007/11/06/getting-to-know-postgis/#comment-1497</guid>
		<description>I am very, very new to GIS. This worked for me with U7.10 and Postgres 8.2.
Thank You.

I Have So Many Questions!!!
Where Is The Query Fun??
And Also How Can I Use This Data Thru QGIS.

Is There Any Other QGIS Like Interface With C/C++ Plugin
Just To Get An Idea.
Please Tell Me Where To Go.
Thanks Again, Saved Me Out Of Lot Of Trouble.
Dissa.</description>
		<content:encoded><![CDATA[<p>I am very, very new to GIS. This worked for me with U7.10 and Postgres 8.2.<br />
Thank You.</p>
<p>I Have So Many Questions!!!<br />
Where Is The Query Fun??<br />
And Also How Can I Use This Data Thru QGIS.</p>
<p>Is There Any Other QGIS Like Interface With C/C++ Plugin<br />
Just To Get An Idea.<br />
Please Tell Me Where To Go.<br />
Thanks Again, Saved Me Out Of Lot Of Trouble.<br />
Dissa.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Getting to know PostGIS by tom</title>
		<link>http://jduck.net/2007/11/06/getting-to-know-postgis/comment-page-1/#comment-1483</link>
		<dc:creator>tom</dc:creator>
		<pubDate>Sat, 11 Oct 2008 02:52:58 +0000</pubDate>
		<guid isPermaLink="false">http://jduck.net/2007/11/06/getting-to-know-postgis/#comment-1483</guid>
		<description>Thanks! This was concise and worked right off the bat for me (note that Postgres is now at 8.3 - just update the number and everything will work fine)</description>
		<content:encoded><![CDATA[<p>Thanks! This was concise and worked right off the bat for me (note that Postgres is now at 8.3 &#8211; just update the number and everything will work fine)</p>
]]></content:encoded>
	</item>
</channel>
</rss>
