Technology | Technology scout

2016-01-06

Practical tips for using map()

When using map() you can sometimes be fooled by Python's lazy evaluation. Many functions that return iterable data don't build the result directly but return an iterator (a map object, a generator and the like), which yields the result values only when it is iterated over.

But sometimes you need the whole result set at once. For example, after map()ing over a list you may want to coerce Python into returning the complete resulting list. This can be done by applying the list() function to the map object like this:

>>> l=[1,2,3]
>>> l1=map(lambda x: x+1, l)
>>> print(l1)
<map object at 0x10f4536d8>
>>> l1=map(lambda x: x+1, l)
>>> list(l1)
[2, 3, 4]

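In line 5 I recreate the map object. Strictly speaking that is not necessary here, because print() only shows the object's representation and does not consume it. But a map object can be iterated only once: after a call to list() (or a for loop) it is empty and has to be recreated.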
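A minimal sketch of this one-pass behaviour, assuming Python 3 (where map() returns an iterator); the variable names are only illustrative:

```python
numbers = [1, 2, 3]
incremented = map(lambda x: x + 1, numbers)

# repr() (which print() uses for a map object) does not consume the iterator
description = repr(incremented)

first_pass = list(incremented)   # consumes the iterator
second_pass = list(incremented)  # nothing left to yield

print(description.startswith("<map object"))
print(first_pass)
print(second_pass)
```

If the values are needed more than once, it is probably easiest to keep the list returned by list() and work with that.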

When passing a method such as strip() to map(), you have to qualify it with the class it belongs to:

>>> l=["Hello  ", "  World"]
>>> l1=map(strip, l)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'strip' is not defined

In this case it's the built-in str class:

>>> l1=map(str.strip, l)
>>> list(l1)
['Hello', 'World']
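As a sketch of why the qualified name works: str.strip is the method looked up on the class, so calling it with a string as its first argument does the same as calling strip() on that string. A lambda or a list comprehension should be equivalent alternatives; the variable names are only illustrative.

```python
words = ["Hello  ", "  World"]

# Three ways to strip every element; all should give the same list
via_class = list(map(str.strip, words))
via_lambda = list(map(lambda s: s.strip(), words))
via_comprehension = [s.strip() for s in words]

print(via_class)
print(via_class == via_lambda == via_comprehension)
```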

That's all for now. Have fun.

Article

2015-12-10

1 comment

by Volker

But you do love your job, don't you?!?

Today I would like to take a brief look at the relationship between employers and employees in permanent positions. In this area it is becoming more and more common to draw up all-inclusive employment contracts. That means they contain wording such as

“With the base salary, all overtime up to the statutory maximum is deemed to be compensated.”

If you voice your displeasure at being expected to put in working time that is not paid separately, you are often told: “Oh, that's not possible. We would have no control over the costs!”

To put it plainly:

Not paying for overtime means shifting the entrepreneurial risk onto the employee.

A company accepts a contract and has agreed on a fee that was estimated on the basis of the client's information. If something about that is wrong, that is not a risk an employee should bear. That is dishonest.

You can also look at it from the other side: my first project broker as a freelancer once replied to my remark “Oh, I didn't write down that half hour!”:

Your working time is the only good you have to sell. Don't give it away. Nobody gives you money for free either.

Of course you should love, or at least enjoy, what you do for a living. And when you enjoy doing something, you don't count the minutes. But as long as I am not given money for nothing, I cannot give away working time for nothing. Manus manum lavat, as the Romans called it …

Article

2015-10-28

0 comment

by Volker

Sense and Sensibility of enterprise IT policies

From time to time I come across a sort of dispute, sometimes even a war, at companies of every size: the central IT department tries to impose a certain hardware or software policy on the coworkers it is responsible for.

Every time this happens there are discussions of BYOD vs. company-owned devices. The IT departments claim that they can't guarantee a certain service level when they don't have access to the resources used by the coworkers. The supporters of BYOD argue that using their own chosen hardware and software increases productivity and satisfaction.

I have to confess that I’m a strong campaigner for using my own devices and software at work. But to get some insight into this topic we need to separate different requirements determined by the type of job the employees do:

Office workers need to get things done. With standard tools. They are often happy to have someone to call if things don't work as expected or needed.
Software engineers use their (mostly) laptops to build software. They need some control over the environment they work in. Libraries, databases, IDEs, operating systems. They choose the tools that get the job done. When things don't work they are able to fix problems by themselves.

These two roughly distinct requirement profiles are faced with two kinds of enterprise environment:

Proprietary systems and protocols, chosen by the IT departments because they know these systems very well and know how to get support from the provider. This category includes Microsoft products (Windows, Exchange, …) and enterprise groupware systems like Novell Groupwise, Lotus Notes etc.
Open protocols and services, which offer similar options but with a different type of maintenance.

Both approaches require nearly the same amount of maintenance, but of different types. Proprietary systems often offer poor support for clients outside the mainstream. For example, have you ever tried to connect an Apple laptop to a Novell file share? Don't try. You'll go mad hunting for the right client tools, fighting software incompatibilities and stuff like that.

So there is a natural match for BYOD environments: use standardized protocols and services like NFS, SMB (which both have their origin in proprietary systems …) or mail protocols like SMTP and IMAP.

If your users would like to work without tinkering with software or services: use a centralized management system. This doesn't necessarily involve closed-source and proprietary tools. But often it does.

For a company with technologically apt users it's better to adopt the BYOD way to maximize productivity and user satisfaction. The latter often doesn't count as a valid argument with IT service departments. Then it's up to the people whose job it is to provide a suitable working environment for happy colleagues to make the service departments work the way they are supposed to work.

This seems to be a particular problem in Germany, where I often enjoy contact with IT service departments featuring a very self-centered philosophy. The notion of being a service department that helps others do their job is not very popular.

Several studies show that companies are seen as more attractive to new employees when they allow BYOD policies.

On the other hand there are security considerations to be taken into account. But I don’t know of any company owned system that prevents willful or even lazy security breaches.

Article

2015-09-25

0 comment

by Volker

Note to self: How to count things in Groovy collections

This time I would like to add a short note on how to count things in Groovy collections. Remember: collections is the general term for lists and maps, in other languages sometimes referred to as arrays or dictionaries.

Groovy has a standard method to count all elements of a collection. It is called size():

l=[1,2,33]
println l.size() // yields 3

If you need to know the number of elements in a collection that match a certain filter, it's time to switch to count(). Count takes a closure and counts all elements for which the closure yields true. This can be as simple as counting all elements larger than 3:

l=[1,2,33]
println l.count { it>3 } // yields 1

Now what if the elements of the list are objects and I want to filter by a specific property of the objects? No problem:

class obj {
    def i
    def j
    
    def obj(in_i, in_j) {
        i=in_i
        j=in_j
    }
    
    String toString() {
        return "obj($i, $j)"
    }
}

def a=new obj(1,1)
def b=new obj(1,2)
def c=new obj(1,3)
def list=[a, b, c]

println list.count { it.j>1 } // yields 2, i.e. counts b and c

With maps it’s a bit more tricky. The it object inside the closure is of type LinkedHashMap$Entry, so we have to deal with its key and value attributes:

class obj {
    def i
    def j
    
    def obj(in_i, in_j) {
        i=in_i
        j=in_j
    }
    
    String toString() {
        return "obj($i, $j)"
    }
}

def a=new obj(1,1)
def b=new obj(1,2)
def c=new obj(1,3)
def list=[eins: a, zwei: b, drei: c]

println list.count { it.value.j>1 } // again yields 2
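For readers coming from the Python post on this blog, here is a rough Python counterpart of the filtered count, as a sketch only. Python lists have no count method that takes a closure, so a generator expression inside sum() is assumed as the stand-in; the Obj class and the variable names merely mirror the Groovy example.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    i: int
    j: int


items = [Obj(1, 1), Obj(1, 2), Obj(1, 3)]
by_name = {"eins": items[0], "zwei": items[1], "drei": items[2]}

# Count list elements whose j is larger than 1
list_count = sum(1 for o in items if o.j > 1)

# For a dict, iterate over key/value pairs, much like the map entries in Groovy
map_count = sum(1 for key, value in by_name.items() if value.j > 1)

print(list_count, map_count)
```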

Hope that helps. See you next time!

Article

2015-09-18

4 comments

by Volker

The HR person and the digital

In his post “Sehr kritische Gedanken zu Arbeiten 4.0 anlässlich der HR-Fachmesse Zukunft Personal” (“Very critical thoughts on Work 4.0 on the occasion of the HR trade fair Zukunft Personal”), @Persoblogger describes quite vividly and, as he says himself, with a good dose of irony, his experiences at and thoughts on the Zukunft Personal trade fair, which he visited. The fair ran under the buzzword-laden motto “Arbeit 4.0” (“Work 4.0”).

With this he wants to back up his thesis that personnel management is anything but digital, let alone 4.0. (Out of respect for the people involved I don't want to use the term Human Resources; a few more sentences on that at the end.)

True to style, he prepared by putting together a kind of curriculum in an Excel sheet and then printing it out. All of that, he says, is very anti-digital and double work. Yes, I see it exactly the same way. The mistake was using Excel for this task in the first place. I would do it with Evernote (in which, by the way, I am writing this text right now). Then I would have it with me on every device I carry around. Without having to print it. Funny, isn't it?

Next he expresses his displeasure that many attendees tweet during the talks. People are not capable of multitasking, after all. In the old days, yes, in the old days, one sat properly in the audience with pen and paper, listening attentively (activity no. 1) and taking notes (activity no. 2). Notice anything? The choice of words is telling, too. The tweeting listeners are not addressed or described as subjects at all, only their activity is. The attentive pen-and-paper listeners of old, those were journalists! Yes, those were the days! Quite apart from the fact that for me the job title journalist is increasingly turning into a term of abuse. Just as I haven't wanted to be called a “consultant” for years.

Now seriously, Mr. Scheller: which conferences did you attend back then? I too would love to go to one where the attention of the listeners (and possibly journalists!) hangs spellbound on the speaker's lips. I also have a certain amount of experience with conferences and training courses, but I have never seen an attention rate of 100% anywhere.

Quite rightly, Mr. Scheller asks why the HR industry keeps chasing one new fad after another through the recruiting village. The answer is inherent: because it is an industry. And it wants to sell. And what sells is the new hot shit, not the solid and well made. He is also right to be annoyed that HR people have discovered that treating applicants respectfully and professionally might actually pay off.

Also very nice are his remarks on Work 4.0 speakers who cannot even restart their PowerPoint presentation without a technician's help. What I find really great, by the way, is his account of the voting process for the “Personalwirtschaftsaward”. Little cardboard cards(!) with QR codes(!) were handed out, which you had to scan with your phone in order to cast your vote online(!). And then the local internet access was overloaded(!). I'm slowly running out of exclamation marks …

After tracing the chocolate mousse dessert etymologically back to grandmother's chocolate pudding, the author closes with an appeal to be open to new ideas but always to question them critically. There is nothing to add to that, and I'm off to see where I can get hold of a chocolate pudding. Have a nice weekend!

PS: Oh yes, there was still the matter of the term Human Resources, or rather why I consider it disrespectful. Resources are essentially goods. Sometimes of a material nature, sometimes immaterial. People are not things. Nor are they goods. It's as simple as that.

Article

2015-08-10

0 comment

by Volker

Note to self: Crawling the web and stripping HTML and entities on the shell

Ever tried to download a list of strings from a web page? There are numerous solutions to such problems. Here is my sort of a toolbox solution which only uses shell commands. This means it’s scriptable for many sites/urls.

In my case the HTML contained the desired list of strings, each on its own line, each surrounded by <b> tags. So we can drop all lines that don't start with a <b> tag and then strip the tags with sed:

curl http://sitename | egrep "^<b>.*" | sed -e 's/<[^>]*>//g' > out.txt

If you try to crawl several sites, the for loop would look like this:

for sitename in site1 site2 site3; do curl "http://$sitename" | egrep "^<b>.*" | sed -e 's/<[^>]*>//g' > "$sitename.txt"; done

This will leave us with (a) file(s) still containing HTML entities. To strip them from the file you can use a text based HTML browser like w3m:

echo "H&auml;llo" | w3m -dump -T text/html

With our for loop over sites we have several text files which all need to be filtered. Use a “triangle swap” for that:

for sitename in site1 site2 site3; do cat "$sitename.txt" | w3m -dump -T text/html > tmp.txt; mv tmp.txt "$sitename.txt"; done
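If w3m is not at hand, the tag and entity stripping can be sketched in Python with the standard library instead. This is a stand-in under assumptions, not what the shell pipeline above does: the helper name strip_markup is made up, the regular expression mirrors the sed expression, and html.unescape takes over the entity decoding.

```python
import html
import re

# Same idea as the sed expression: anything between < and > is a tag
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_markup(line: str) -> str:
    """Remove tags, then decode HTML entities (hypothetical helper)."""
    without_tags = TAG_PATTERN.sub("", line)
    return html.unescape(without_tags).strip()


sample = ["<b>H&auml;llo</b>", "<b>Tom &amp; Jerry</b>"]
cleaned = [strip_markup(line) for line in sample]
print(cleaned)
```

Decoding the entities after removing the tags keeps an escaped &lt; or &gt; in the text from being mistaken for markup.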

Happy crawling!

Article

2015-08-05

0 comment

by Volker

Numbering lines with Unix

Have you ever had a csv file and wanted to import it into a database? And you would like to add a leading ID column numbered from 0, separated by, let's say, a comma? Here's a hint: use the Unix pr (for print) utility:

pr -tn, -N0 test.csv | sed -e 's/^[ \t]*//' > new.csv

My test.csv contains a list of all world manufacturer ids (WMI) for car VINs (vehicle identification numbers). The first few rows look like:

AFA,Ford South Africa
AAV,Volkswagen South Africa
JA3,Mitsubishi

Please note that column headers are added later on. Now the output looks like this:

0,AFA,Ford South Africa
1,AAV,Volkswagen South Africa
2,JA3,Mitsubishi

Now for the curious: what does the command line do?
First for the pr part:

-t means: omit headers (remember: normally pr is used to print paginated content …)

-n, means: number lines. Use a comma as the separator

-N0 means: start with 0

So much for that part. The pr utility normally numbers lines within a given column width (standard is 5 chars). This results in leading whitespace. We don’t want that, so the sed command removes spaces and tabs at the beginning of the line.
Enough Unix magic for now. Happy hacking!

Update: Detlef Kreuz just mentioned on Twitter, that this task could also be accomplished with awk:

awk -e '{print NR-1 "," $0;}' test.csv > new.csv

Here awk executes the commands inside the curly braces for every line of input. For each line it prints the line number minus 1, followed by a comma and the complete line. $0 is an internal awk variable containing the complete current line, while $1, $2 … contain the split up fields (where to split is determined by FS, the field separator, which defaults to a space). Thanks Detlef!
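The same numbering can also be sketched in Python, assuming the input is small enough to be held in memory as a list of lines. number_lines is a made-up helper name, and the sample rows are the ones from test.csv shown above.

```python
def number_lines(lines, start=0, sep=","):
    """Prefix each line with a running number (hypothetical helper)."""
    return [f"{n}{sep}{line}" for n, line in enumerate(lines, start=start)]


rows = [
    "AFA,Ford South Africa",
    "AAV,Volkswagen South Africa",
    "JA3,Mitsubishi",
]
numbered = number_lines(rows)
print("\n".join(numbered))
```

To process a real file, something like number_lines(open("test.csv").read().splitlines()) should do.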

Article

2015-07-31

0 comment

by Volker

Note to self: How to use screen

This posting will start a series of rather short articles, where I present things that I use from time to time but tend to forget how to do :)
The first serving deals with the undeniably useful Unix command screen. Screen opens a virtual terminal in which you can start long-running processes; you can detach at any time and reattach later, while the processes continue to run. You can think of screen as nohup on steroids. Start it with a blank shell and create a session with the symbolic name testo:

screen -S testo

You are greeted with … well, a fresh and clean shell. Here you can start doing things that will run a long time. To detach from that screen, use the key sequence ctrl-a d. Nearly all key sequences for screen start with ctrl-a. And the “d” stands for “detach”. To see what's going on behind your back, use the screen list command:

screen -ls
There is a screen on:
        1387.testo      (31.07.2015 17:34:57)   (Detached)
1 Socket in /var/run/screen/S-vmg.

Here 1387.testo is the key to the session, consisting of the process id and the symbolic name:

ps auxf
…
1387 ?        Ss     0:00 SCREEN -S testo
1388 pts/2    Ss+    0:00  \_ /bin/bash

To reattach to the screen, you might have guessed it, use screen's reattach option:

screen -r testo

You can detach and reattach to the screen as often as you like. When done with your long running processes, just log out of the screen using ctrl-d. You will be informed that the screen has been shut down:

[screen is terminating]

Article

2015-07-14

0 comment

by Volker

Playing around with services in grails console

Suppose you have a grails project and have written a service doing some database magic to pull together data. Now suppose the very unlikely case that it's not running as smoothly as you thought. To expel the black magic you probably would like to use the grails console to play around with your domain classes and services. Using a domain class is as simple as importing it and using it:

import myproject.Domainclass
def instance=Domainclass.get(3)
println instance.id+"\t"+instance.name

The service classes however are not that accessible to manipulation. You need to request the service bean instance by name from the application context named ctx:

def mcs=ctx.getBean("myCoolService")
def allThings=mcs.getAllThings()

Remember to use the (lowercase) instance name when calling getBean() just as it would be injected into your controller:

class GraphController {
    MyCoolService myCoolService
}

Pulling the strings together you can do more complex tests:

import myproject.Domainclass

def mcs=ctx.getBean("myCoolService")
def instance=Domainclass.get(3)
println instance.id+"\t"+instance.name
println "-----------------------------------------"
def allThings=mcs.getAllThings()
allThings.each { n ->
    println n.id+"\t"+n.thingstype+"\t\t"+n.name
}

Hope that helps. As always: in case of questions or corrections / additions please leave a comment :)

Article

2015-07-03

6 comments

by Volker

Adding assets in Grails 3

When using modern web development technologies, you often come across frameworks or libraries which use additional resources apart from css stylesheets, images and javascript. One such example is Font Awesome, which needs some font files, located in the /fonts subdirectory of the unzipped package. In Grails 2 lazy coders would put this directory in the /web-app folder. In Grails 3 you should (!) use the asset pipeline for these files too, and here are two ways that work:

Simply put the files into the grails-app/assets/stylesheets folder. This is not a very elegant way nor is it the intended way to use the asset pipeline.

Put the fonts directory parallel to stylesheets, images and javascript into the grails-app/assets/ folder. For the asset pipeline to know the new directory, specify it in the build.gradle file:
assets {
    minifyJs = true
    minifyCss = true
    includes = ["fonts/*"]
}

The last thing to do is to patch the font file paths in the font-awesome.css and/or font-awesome.min.css file. Just remove the “../fonts/” part of the url() paths, so they all look like this:

font-family: 'FontAwesome';
src: url('fontawesome-webfont.eot?v=4.3.0');

That's all.

This post by David Estes put me on the right track, since the official documentation doesn’t mention Grails 3 issues. Thanks David!

Technology scout

Finding a way through
