Some posts from my old blog
2013-11-28 : New version of the hydro-meteorological alert bulletin application
Three years ago, to try out Google App Engine, I developed an application that fetched the Hydro-Meteorological Alert (Allerta Meteoidrologica) bulletins (pdf!) from Arpa Piemonte, processed them and sent them (for free) by email to subscribers. In the last few days I have rebuilt the application, this time on Microsoft Azure: [xxxxxxxx]
The usual warning. Given the experimental and demonstrative nature of the application, the correctness and regularity of the data are not guaranteed. The data must therefore be considered NOT VALID. For official data, always refer to those published on the ARPA Piemonte website at http://www.arpa.piemonte.it

2012-02-17 : Article recommendation: “The Management Team” by Joel Spolsky
The “management team” isn’t the “decision making” team. It’s a support function. You may want to call them administration instead of management, which will keep them from getting too big for their britches. Administrators aren’t supposed to make the hard decisions. They don’t know enough.
http://www.avc.com/a_vc/2012/02/the-management-team-guest-post-from-joel-spolsky.html

2011-08-28 : GDocBackup for Google Apps
Today I’ve published the first release of GDocBackup with support for Google Apps. It’s a very preliminary release with some issues. It’s very easy to use: in the config section, insert the administrator username and password, the domain name and the “OAuth consumer secret” key. Then tick the “Google Apps mode” checkbox. GDocBackup will extract all the documents for each user of your domain.
GDocBackup 0.4.40.153 http://code.google.com/p/gdocbackup/downloads/list

2011-02-09 : A small contribution to a Google library
OK, I know, it’s really a small thing, just 10 lines of code or a little more, but it still makes me really happy. Thanks to Claudio. http://code.google.com/p/google-gdata/issues/detail?id=477
2010-11-10 : My first Android test app
My first test on Android. A Memory game for my kids. :)

But it has some bugs. First of all I need to fix the layout when rotated. :(

2010-09-04 : Torino GTUG ?

This Thursday (September 9, 2010) we will try to see whether we can start a GTUG in Turin. (GTUG = Google Technology User Group). Anyone interested is welcome.
http://torino.gtugs.org/
2010-06-30 : Automatic delivery of ARPA Piemonte’s Hydro-Meteorological Alert bulletins thanks to Google App Engine
ARPA Piemonte regularly publishes various weather bulletins and the like. Among these I find the Hydro-Meteorological Alert (Allerta MeteoIdrologica) bulletin very useful. It is published every day around 1 p.m. The bulletin highlights the significant weather events expected in the following 36 hours and their expected effects on the ground (landslides, road problems, etc.).

The Piedmont territory is divided into 11 zones: events are reported in detail for each zone. For each zone a criticality level is also given: ordinary, moderate or high.

Looking at the bulletin I thought: I would like to be warned when critical events are forecast in my zone (heavy snowfall, intense rain, flooding). On the website I found no automatic delivery service, let alone one that sends “by zone” and only in case of an alert. What to do? Solution: I’ll build the service myself! :)
I had been trying out Google App Engine (Google’s cloud computing platform) for a while, so putting 1+1 together seemed obvious. And here is the result:

It was a good opportunity to run many experiments with the services offered by Google App Engine, and more.
Some technical details:
- users register by entering their email address and choosing the zones for which they want to receive the bulletins. They can also choose whether to receive the bulletin every day or only in case of an actual alert.
- to avoid spam, I integrated reCaptcha
- via Scheduled Tasks the application checks whether a new bulletin (in pdf) has been published
- the textual data of the bulletin are extracted with a modified version of PdfBox that can run on GAE (see [xxxxxxx]). The data are then reorganized into xml.
- when the application finds a new bulletin, it sends the emails to the users (via TaskQueues)
- the user information and the current bulletin (in xml) are stored in the GAE Storage
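The per-user choice described in the list above (selected zones, every day or alerts only) can be sketched in Java as follows. This is only an illustration: the class names, the Criticality enum and the alert threshold are all assumptions, not the code of the real application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: these names do not come from the real application.
enum Criticality { NONE, ORDINARY, MODERATE, HIGH }

final class Subscriber {
    final String email;
    final Set<String> zones;    // zones chosen at registration
    final boolean alertsOnly;   // false = receive the bulletin every day

    Subscriber(String email, Set<String> zones, boolean alertsOnly) {
        this.email = email;
        this.zones = zones;
        this.alertsOnly = alertsOnly;
    }
}

final class BulletinMailer {
    /**
     * Returns the addresses to notify for one bulletin. A daily subscriber is
     * notified if they chose at least one zone; an alerts-only subscriber is
     * notified only when one of their zones reaches the given threshold
     * (the threshold itself is an assumption of this sketch).
     */
    static List<String> recipients(List<Subscriber> subscribers,
                                   Map<String, Criticality> levelByZone,
                                   Criticality alertThreshold) {
        List<String> result = new ArrayList<>();
        for (Subscriber s : subscribers) {
            for (String zone : s.zones) {
                Criticality level = levelByZone.getOrDefault(zone, Criticality.NONE);
                if (!s.alertsOnly || level.compareTo(alertThreshold) >= 0) {
                    result.add(s.email);
                    break; // one email per subscriber is enough
                }
            }
        }
        return result;
    }
}
```

In a setup like the one described, a method of this kind would run once per new bulletin, and each returned address would become one task in the queue.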
Warning: this is an experiment. The data might be extracted incorrectly, so the application might send wrong alerts. As I stated on the site, THE DATA ARE NOT VALID. Still, the application has been running for a few months and has always behaved well. I hope it stays that way.
2010-04-20 : PdfBox text extraction & GAE
How to do text extraction from pdf files using PdfBox on Google App Engine
(Warning: I used an old version of PdfBox: 0.7.3.)
PdfBox is a very popular Java library for creating and managing pdf files. It can also extract text from existing pdf files. PdfBox is published as a jar file. I wanted to use it on Google App Engine (Java version) to extract text from a particular area of a pdf page. PdfBox allows that: the class to use is PDFTextStripperByArea. I tried it, but GAE blocked me: PDFTextStripperByArea uses JRE classes that are not allowed, in particular java.awt.Rectangle and Rectangle2D. GAE applies a “white list” approach: only a subset of the standard JRE classes is allowed to run on GAE, and 99% of java.awt.* is blocked. http://code.google.com/appengine/docs/java/jrewhitelist.html
There is also another problem. During text extraction PdfBox uses a temp file. By default it is created on the file system, and GAE also blocks access to the file system.
My solution was:
- use my own Rectangle instead of java.awt.Rectangle
- use an “in memory” temp file
The first change required modifying and recompiling PdfBox.
My own Rectangle
I created my own Rectangle and Rectangle2D classes. My Rectangle implementation is not complete compared to the awt one: I only created the fields and methods that were required. Then I created a new PDFTextStripperByArea: PDFTextStripperByAreaGAE. I did not modify the original PDFTextStripperByArea because I didn’t want to break compatibility with the PdfBox library. The new class uses only my Rectangle, with no more references to java.awt, so GAE now allows it to run. Apart from that, the new PDFTextStripperByAreaGAE is the same as the old PDFTextStripperByArea: I copied and pasted 99% of the original code.
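To give an idea of how small such a replacement can be, here is a minimal sketch of a rectangle class with no dependency on java.awt. The class name, the field types and the contains method are assumptions made for illustration; the real classes in the modified PdfBox may look different.

```java
// Hypothetical stand-in for java.awt.Rectangle: it holds only a position
// and a size, plus a point-in-rectangle check. Names are assumptions.
final class SimpleRectangle {
    final float x;
    final float y;
    final float width;
    final float height;

    SimpleRectangle(float x, float y, float width, float height) {
        this.x = x;
        this.y = y;
        this.width = width;
        this.height = height;
    }

    /** True when the point is inside; left/top edges inclusive, right/bottom exclusive. */
    boolean contains(float px, float py) {
        return px >= x && px < x + width
            && py >= y && py < y + height;
    }
}

class SimpleRectangleDemo {
    public static void main(String[] args) {
        // Same numbers as the "Area1" region used later in this post.
        SimpleRectangle area1 = new SimpleRectangle(26, 86, 62, 10);
        System.out.println(area1.contains(30, 90));
        System.out.println(area1.contains(100, 90));
    }
}
```

A class like this touches only primitive types, so nothing in it depends on classes outside the GAE white list.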
Temp file in memory
PdfBox uses the file system by default, but you can force it to use an “in memory” buffer. PdfBox ships with org.pdfbox.io.RandomAccessBuffer, so I use that.
byte[] pdfBytes; // contains the bytes of the Pdf file
// in-memory scratch buffer, used instead of a temp file on disk
RandomAccessBuffer tempMemBuffer = new RandomAccessBuffer();
PDDocument doc = PDDocument.load(new ByteArrayInputStream(pdfBytes), tempMemBuffer);
PDFTextStripperByAreaGAE sa = new PDFTextStripperByAreaGAE();
// Rectangle here is my own class, not java.awt.Rectangle
sa.addRegion("Area1", new Rectangle(26, 86, 62, 10));
sa.addRegion("Area2", new Rectangle(99, 86, 94, 14));
...
PDPage p = (PDPage) doc.getDocumentCatalog().getAllPages().get(0); // page 1
sa.extractRegions(p);
String area1 = sa.getTextForRegion("Area1");
String area2 = sa.getTextForRegion("Area2");
...
doc.close();
Live demo
[xxxxxxx]
(please, use small pdf files)
2009-10-30 : Torino community: started!
The Torino Technologies Group community has officially started its activity.
In the first public meeting we talked about source code version control and virtualization (Vmware, HyperV, Citrix, etc.). On the first topic I described my experience with Subversion. The slides will be published on the site as soon as it is ready.
2009-07-23 : From an old interview with David Parnas
Another recommendation, again about software development. An old interview (1999) with David Parnas. Worth reading. A couple of enlightening passages that are still very relevant.
Q: What is the most often-overlooked risk in software engineering?
A: Incompetent programmers. There are estimates that the number of programmers needed in the U.S. exceeds 200,000. This is entirely misleading. It is not a quantity problem; we have a quality problem. One bad programmer can easily create two new jobs a year. Hiring more bad programmers will just increase our perceived need for them. If we had more good programmers, and could easily identify them, we would need fewer, not more.
Q: What is the most-repeated mistake in software engineering?
A: People tend to underestimate the difficulty of the task. Overconfidence explains most of the poor software that I see. Doing it right is hard work. Shortcuts lead you in the wrong direction and they often lead to disaster.
2009-06-10 : GDocBackup showed during Google IO
In the last few days, Google has published the videos and slides of all the sessions of the last Google IO (May 27-28, 2009, San Francisco). Google IO is the annual developer conference organized by Google.
Eric Bidelman (Google developer) showed GDocBackup during the session “Building Applications in the Cloud”. Ok, ok, only 30 seconds during a long session (40 minutes) but, despite that, I’m very happy and proud!
Eric showed it as an example of an application that uses the export functionality of the Google Documents List Data APIs to do a local backup. Thank you Eric!

(Video time: 23:02)
[ Full video: http://www.youtube.com/watch?v=zZa6bZmGPYA ]
2009-03-17 : GDocBackup - a simple Google Documents Backup utility
Recently the Google Documents List API team released a new version of the API. Now you can download documents. Using the new API I wrote a very simple .NET utility for downloading all my documents. It does a simple backup: for each document in Google Docs, it downloads the document if it is not present on the local disk or if it has been modified.
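The backup rule above fits in a few lines. The sketch below is in Java rather than .NET and is only an illustration: it assumes that “modified” is detected by comparing timestamps, which may not be how GDocBackup actually does it, and all names are hypothetical.

```java
import java.time.Instant;
import java.util.Optional;

final class BackupDecision {
    /**
     * Decides whether a document must be downloaded.
     * localModified is empty when there is no local copy.
     * Assumption of this sketch: a remote timestamp later than the local
     * one means the document was modified after the last backup.
     */
    static boolean needsDownload(Optional<Instant> localModified, Instant remoteUpdated) {
        if (!localModified.isPresent()) {
            return true; // not present on the local disk
        }
        return remoteUpdated.isAfter(localModified.get());
    }
}
```

Keeping the decision free of file and network access makes it easy to check on its own; the caller would read the local file time and the remote update time and pass them in.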

More info, compiled exe and source code: [xxxxxx]
2008-10-24 : Pdf2AfpLib - alpha release
I’ve released on SourceForge the alpha version of Pdf2AfpLib. http://sourceforge.net/projects/pdf2afplib/
Pdf2AfpLib is a library for converting Pdf files to Afp files. It’s written in C# and uses Ghostscript for a part of the conversion process. At the moment it generates only black and white Afp files. No gray scale, no color. Important: all the content is rasterized. In the resulting afp file, every page contains only one big image that completely covers the page.
The conversion runs through 2 steps:
- The Pdf is converted to tiff G3 files using Ghostscript, one tiff for each page.
- The tiff files are parsed and the image content is imported as an IOCA image in the output afp.
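As an illustration of the first step, the sketch below builds a Ghostscript command line that writes one G3 tiff per page. It is written in Java rather than C#, and it assumes the command-line executable is used; the executable name, the resolution and the output file pattern are assumptions, and Pdf2AfpLib may invoke Ghostscript differently.

```java
import java.util.Arrays;
import java.util.List;

final class GhostscriptCommand {
    /**
     * Builds the arguments for a pdf to tiff G3 conversion.
     * outputPattern should contain a page counter such as %03d so that
     * Ghostscript writes one file per page.
     */
    static List<String> pdfToTiffG3(String gsExecutable, String inputPdf,
                                    String outputPattern, int dpi) {
        return Arrays.asList(
            gsExecutable,
            "-dBATCH",                       // exit when the input is done
            "-dNOPAUSE",                     // do not wait between pages
            "-sDEVICE=tiffg3",               // black and white G3 fax tiff
            "-r" + dpi,                      // raster resolution (assumed value)
            "-sOutputFile=" + outputPattern,
            inputPdf);
    }

    public static void main(String[] args) {
        List<String> cmd = pdfToTiffG3("gs", "input.pdf", "page-%03d.tif", 300);
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list could be handed to java.lang.ProcessBuilder to run the conversion. The tiffg3 device produces 1-bit images, which matches the black and white only limitation mentioned above.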

2008-04-08 : IIS6 StackOverFlow
Sometimes you need to run code that makes heavy use of the stack because of recursive algorithms. But very deep recursion can exhaust the stack. I ran into this situation in a particular use of iTextSharp. In my case, IIS 6 (w3wp.exe) crashed with an unknown exception that seemed to be thrown from kernel32.
I spent many hours trying to understand where the problem was. Using WinDbg and AdPlus I understood that the problem came from .NET managed code, and after dumping the stack it was clear which methods were involved.
First: it’s not a library bug. If I reduce the work-load (number of input pdf files) it works perfectly. Second: the same piece of code, with the same input pdf files, runs perfectly from a command-line application. So, the problem is related to the IIS “environment”.
Solution: increase the stack! Ok, but how ? Simple: running the code inside a thread with a bigger stack! On IIS6 the “default” threads are created with 256 KB of stack. I haven’t found a way to change it. But, from your asp.net code you can create and run a new thread with a bigger stack.
Example code (direct run):
protected void Button1_Click(object sender, EventArgs e)
{
    this.Run();
}
private void Run()
{
    //... code with or calling library with high use of stack
}
Passing through a “working” thread (with 1 MB of stack):
protected void Button1_Click(object sender, EventArgs e)
{
    this.RunAsSeparatedThread();
}
private void RunAsSeparatedThread()
{
    // second argument: maximum stack size in bytes (1 MB)
    Thread t = new Thread(Run, 1 * 1024 * 1024);
    t.IsBackground = true;
    t.Start();
    t.Join(); // wait for the worker to finish before returning
}
private void Run()
{
    //... code with or calling library with high use of stack
}
2007-01-16 : IEC16022Sharp
I have published the first public release of Iec16022Sharp, a C# library for generating 2D Data Matrix barcodes.
http://sourceforge.net/projects/iec16022sharp/
The core of the library is a C# port of existing ANSI C code currently maintained by Stefan Schmidt (http://www.datenfreihafen.org/projects/iec16022.html).
Misc links
2009-10-26 : Joel Spolsky : Capstone projects and time management : https://www.joelonsoftware.com/2009/10/26/capstone-projects-and-time-management/