Apache Barcamp Spain

October 8th is the date for the first Apache Barcamp Spain. After Oxford, Sydney and the ApacheCon barcamps, the next stop is Seville!

Friday Evening

For those arriving on Friday there will certainly be time for tapas & drinks. We’ll be updating the website as the date gets closer.

Saturday

On Saturday we’ll get together and plan the sessions for the day, pure Barcamp style, and FREE as in free beer.

  • YOU decide what talks are given, in an open format
  • Meet developer stars that have already confirmed their attendance
  • No commercial pitches
  • Networking, networking, networking
  • If you want to present or share something, this is your chance: just be convincing and get enough votes to land a time slot in one of the tracks

Do you need more excuses to spend a weekend in southern Spain? Keep reading then…

Saturday Evening

Locals will introduce you to the Spanish fiesta. You don’t wanna miss this one, as what happens in Spain… well, you won’t remember it when you wake up on Sunday anyway 🙂

More info

Working in a distributed team

Reading the post from James Governor, No Need to Commute, Ever, prompted by a job offer from Genuitec on Twitter, I felt like writing a bit about my experience working on distributed teams.

@genuitec No need to commute, ever. See more of your family, work with talented people: Genuitec is growing, developers apply today

That is exactly the Open Source community model. Seven years ago or so, it worked pretty well for me on the OSS projects I participated in. Later, when I joined Mergere, where we provided services on top of Apache Maven, it was pretty clear that if we wanted the best people we’d need to hire them wherever they were, and the advantage with OSS contributors is that there’s no need for a resume: you can see exactly what their work is like. So there were people in the team working in Los Angeles, Sydney, Paris, Florida, the Philippines,… A bit painful to get everybody together at the same time when we wanted to, as we were spread all across the world, but that also ensured that the number of meetings and their length were reduced, as everybody made an effort to favor offline communication, in their own best interest.

There’s a big difference between working remotely and a distributed team, though. When some of the team members work remotely but a number of them don’t, you have an issue: there are going to be interactions that happen in person and never make it to the mailing lists, issue tracker, IRC,… But when the whole team is distributed (or most of it, anyway), all the communication flows through the same channels, with the added plus that everything is documented and you can go back to previous conversations.

Working at the Googleplex is nice, sure, but imagine what you can do with all those hours spent commuting: the ability to work from anywhere, to eat at home, to see your family more often,… Does free food make up for that?

And what’s the advantage for the companies? You can reduce expenses, but more importantly you can offer employees something they won’t get in most other companies: who wouldn’t prefer working remotely to having to go to an office? You also get access to great people all over the world instead of in one specific area.

Just make sure you take some of the cost savings and use them to get the whole team together as soon as possible, and for a few days from time to time. Getting beers together does help human interactions 🙂

Introduction to Amazon Web Services Identity and Access Management

Using AWS Identity and Access Management you can create separate users and permissions to use any AWS service, for instance EC2, and avoid giving other people your Amazon username, password or private key.

You can set very granular permissions on users, groups, specific resources, and combinations of them. It can get really complex quickly! But there are several very common use cases that IAM is useful for, for instance an AWS account shared by a team of developers.

Getting started

You can go through the Getting Started Guide, but I’ll save you some time:

Download IAM command line tools

Store your AWS credentials in a file, e.g. ~/account-key

AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE
AWSSecretKey=wJalrXUtnFEMI/K7MDENG/bPxRfiCYzEXAMPLEKEY

Configure environment variables

export AWS_IAM_HOME=<path_to_cli>
export PATH=$AWS_IAM_HOME/bin:$PATH
export AWS_CREDENTIAL_FILE=~/account-key

Creating an admin group

Once you have IAM set up, the next step is to create an Admins group where you can add yourself.

iam-groupcreate -g Admins

Create a policy in a file, e.g. MyPolicy.txt

{
   "Statement":[{
      "Effect":"Allow",
      "Action":"*",
      "Resource":"*"
      }
   ]
}

Upload the policy

iam-groupuploadpolicy -g Admins -p AdminsGroupPolicy -f MyPolicy.txt

Creating an admin user

Create an admin user with

iam-usercreate -u YOUR_NAME -g Admins -k -v

The response looks similar to this:

AKIAIOSFODNN7EXAMPLE
wJalrXUtnFEMI/K7MDENG/bPxRfiCYzEXAMPLEKEY
arn:aws:iam::123456789012:user/YOUR_NAME
AIDACKCEVSQ6C2EXAMPLE

The first line is your Access Key ID and the second is your Secret Access Key; the third and fourth are the user’s ARN and user ID. You need to save the two keys.

Save your Access Key ID and your Secret Access Key to a file, for instance ~/YOUR_NAME_cred.txt. From now on you can use those credentials instead of the global AWS credentials for the whole account.
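Using the example values returned above, the contents of ~/YOUR_NAME_cred.txt follow the same format as the account key file:

AWSAccessKeyId=AKIAIOSFODNN7EXAMPLE
AWSSecretKey=wJalrXUtnFEMI/K7MDENG/bPxRfiCYzEXAMPLEKEY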

export AWS_CREDENTIAL_FILE=~/YOUR_NAME_cred.txt

Creating a dev group

Let’s create an example dev group where the users will have only read access to EC2 operations.

iam-groupcreate -g dev

Now we need to set the group policy to allow all EC2 Describe* actions, which are the ones that allow users to see data, but not to change it. Create a file MyPolicy.txt with these contents

{
  "Statement": [
     {
       "Sid": "EC2AllowDescribe",
       "Action": [
         "ec2:Describe*"
       ],
       "Effect": "Allow",
       "Resource": "*"
     }
   ]
 }

Now upload the policy

iam-groupuploadpolicy -g dev -p devGroupPolicy -f MyPolicy.txt

Creating dev users

To create a new AWS user under the dev group

iam-usercreate -u username -g dev -k -v

Create a login profile for the user to log into the web console

iam-useraddloginprofile -u username -p password

The user can now access the AWS console at

https://your_AWS_Account_ID.signin.aws.amazon.com/console/ec2

Or you can make life easier by creating an alias

iam-accountaliascreate -a maestrodev

and now the console is available at

https://maestrodev.signin.aws.amazon.com/console/ec2

About Policies

AWS policy files can get really complex. The AWS Policy Generator helps as a starting point and shows what actions are available, but it won’t help you make policies easier to read (using wildcards) or apply them to specific resources. Amazon could have provided a better generator tool that lets you choose your own resources (users, groups, S3 buckets,…) from an easy-to-use interface instead of making you look up all sorts of crazy AWS identifiers. Hopefully they will provide a comprehensive tool as part of the AWS Console.
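As an illustration of scoping a policy to specific resources, here is a minimal sketch (the bucket name my-bucket is hypothetical) that would grant read-only access to a single S3 bucket:

{
  "Statement": [
     {
       "Sid": "S3ReadMyBucket",
       "Effect": "Allow",
       "Action": [
         "s3:GetObject",
         "s3:ListBucket"
       ],
       "Resource": [
         "arn:aws:s3:::my-bucket",
         "arn:aws:s3:::my-bucket/*"
       ]
     }
  ]
}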

There is more information available at the IAM User Guide.

Update

Just after I wrote this post, Amazon made IAM available in the AWS Management Console, which makes using IAM way easier.

Finding duplicate classes in your WAR files with Tattletale

Have you ever hit all sorts of weird errors when running your webapp because several of the included jar files contain the same classes in different versions, and the wrong one is picked up by the application server?

Using the JBoss Tattletale tool and its Tattletale Maven plugin you can easily find out whether you have duplicate classes in your WAR’s WEB-INF/lib folder and, most importantly, fail the build automatically if that’s the case, before it’s too late and you get bitten in production.

Just add the following plugin configuration to the build/plugins section of your WAR POM. It can also be used for EAR, assembly and other types of projects.

<plugin>
  <groupId>org.jboss.tattletale</groupId>
  <artifactId>tattletale-maven</artifactId>
  <version>1.1.0.Final</version>
  <executions>
    <execution>
      <phase>verify</phase> <!-- needs to run after WAR package has been built -->
      <goals>
        <goal>report</goal>
      </goals>
    </execution>
  </executions>
  <configuration>
    <source>${project.build.directory}/${project.build.finalName}/WEB-INF/lib</source>
    <destination>${project.reporting.outputDirectory}/tattletale</destination>
    <reports>
      <report>jar</report>
      <report>multiplejars</report>
    </reports>
    <profiles>
      <profile>java6</profile>
    </profiles>
    <failOnWarn>true</failOnWarn>
    <!-- excluding some jars, if jar name contains any of these strings it won't be analyzed -->
    <excludes>
      <exclude>persistence-api-</exclude>
      <exclude>xmldsig-</exclude>
    </excludes>
  </configuration>
</plugin>
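With this configuration in place a normal build runs the check; since failOnWarn is set, the build should fail whenever the multiplejars report finds the same class in more than one jar:

mvn clean verify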

You will need to add the JBoss Maven repository to your POM repositories section, or to your repository manager. Make sure you use the repository that only contains JBoss artifacts or you may experience conflicts between artifacts in that repo and the Maven Central repo.

Adding extra repositories is a common source of problems and makes builds longer (every repo is queried for artifacts). What I do is add an Apache Archiva proxy connector with a whitelist entry for org/jboss/**, so the repo is only queried for org.jboss.* groupIds.

<repository>
  <id>jboss</id>
  <url>https://repository.jboss.org/nexus/content/repositories/releases</url>
  <releases>
    <enabled>true</enabled>
  </releases>
  <snapshots>
    <enabled>false</enabled>
  </snapshots>
</repository>

New challenges from DevOps: development cycle for your infrastructure

One of the main ideas behind DevOps adoption is the concept of “infrastructure as code”. Tools like Puppet or Chef allow you to define your infrastructure programmatically, the provisioning of your servers: what packages are installed, what the contents of files are,…

If server provisioning is a key point in operations, then code management becomes key too once you start coding your servers. You need source control for your infrastructure, you need tags, versioning, dependencies between components,… You need development, testing, QA, release,… for your infrastructure!

Imagine an environment where you have some server stack running in production using Puppet, with a manifest that defines the packages and files on that server, and many servers running the same configuration. That Puppet definition must be in source control.
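As a tiny illustration, a minimal sketch of such a manifest (the package and file names are hypothetical):

# hypothetical manifest: install a package and manage one of its files
package { 'ntp':
  ensure => installed,
}

file { '/etc/ntp.conf':
  ensure  => file,
  source  => 'puppet:///modules/ntp/ntp.conf',
  require => Package['ntp'], # install the package before managing its file
}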

Now a security fix or a new version of a package must be installed on all the servers. Do you just want to change the manifest and push it out to all the running servers? Doesn’t sound like a great idea, does it? Hey, we have been tuning development best practices over the years for use cases just like this one.

What you want to do is create a new branch where you can make that change and test it on some server that is not in production; let’s call it the development environment, original isn’t it?
The change works as expected and your app still works, great! Now you can merge that branch of the Puppet manifests into trunk, possibly along with changes made by other people, which at some point you will want to test together in a production-like environment, maybe with several servers in a cluster, load balancing, etc., and, very importantly, with the next version of the application that is going to be deployed. You create a new tag and version to be able to identify it later, and deploy it to that environment; let’s call it QA or staging.
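A minimal sketch of that cycle with git (branch and tag names are hypothetical):

git checkout -b security-fix   # branch the Puppet manifests for the change
# edit the manifests, apply them to a development server, verify the app
git checkout master
git merge security-fix         # merge back, along with other people's changes
git tag infra-42               # tag a known version to roll out to QA/staging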

What all this cycle allows you to do is clearly define what is running in each environment, using versions; easily find issues between deployments, using source control; and roll back to a known working configuration if needed.

After all, if you deal with infrastructure as code you should use code development best practices, and you’ll get the same benefits.

Javagruppen 2011: Build and test in the cloud slides

Last week I spent some good days in Denmark at the Javagruppen annual conference, as I mentioned in a previous post. It’s a small conference, which makes it possible to cover any questions the attendees have and to select what you talk about based on their specific interests.

I talked about creating an Apache Continuum + Selenium grid on EC2 for massively multi-environment and parallel build and test. You can find the slides below, although it’s mostly a talk/visual presentation.

The location was great, a hotel with a spa in Jutland, and the people, attendees and speakers alike, were very nice too. My advice: go to Denmark, but try to do it in summer 🙂 I’m sure it makes a difference, although it’s pretty cool to be in a hot tub outside at 0°C (32°F).

And you can find some trip pictures on Flickr.

Nyhavn panorama

GPG, Maven and OS X

GPG on the Mac has always been quite an issue: several installation choices, and hard to configure. Now it seems that native GPG tools for OS X are back to life with the GPGTools project, which provides a single, easy-to-use installer.

GPGTools is an open source initiative to bring OpenPGP to Apple OS X in the form of a single installer package

So I installed the package, logged out and in again for the PATHs to take effect, and got the agent up and running by executing

gpg-agent --daemon

(It will be automatically started when you restart)

Now, to configure Maven to use this GPG 2 version and the GPG agent, I added a profile to my ~/.m2/settings.xml:

    <profile>
      <id>gpg</id>
      <activation>
        <activeByDefault>true</activeByDefault>
      </activation>
      <properties>
        <gpg.useagent>true</gpg.useagent>
        <gpg.executable>gpg2</gpg.executable>
      </properties>
    </profile>

This way the agent only prompts for the GPG key password once for each session, and Maven uses the right gpg executable.
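With the profile active by default, any project that uses the maven-gpg-plugin picks up gpg2 and the agent automatically; as a quick (hypothetical) test you can sign a packaged project directly:

mvn package gpg:sign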

Hudson and Jenkins, can brands and trademarks affect an open source project?

Maybe you haven’t heard yet (where have you been!), but Hudson, the Continuous Integration server, is being renamed to Jenkins.

Sacha Labourey sheds a bit of light on the background that motivated the change, but let’s just say that the community wanted to move in a direction that Oracle was against, Oracle claimed ownership of the Hudson trademark, and so the community decided to move the source code to another host and use a different name for the fork.

So you have Oracle on one side, which AFAIK is contributing little to the project, and most of the people making actual contributions on the other.

Now, playing devil’s advocate: what if Oracle decided to continue Hudson by, for instance, just merging all Jenkins changes back into Hudson? No non-Oracle committers, a bit of development on their side if needed, just a close copy of Jenkins, but with the Hudson brand.

Would that succeed?

Of course you know that the community-endorsed project is now Jenkins. How many people out there would know that too?

Would current users in big corporations be aware of the status of the project?

Would it be possible for a giant like Oracle to use its marketing machinery to promote Hudson as an Oracle product and build some Hudson-branded products on top by just owning the trademark?

Could somebody pull a trick on the open source world as Eddie Murphy does in The Distinguished Gentleman? Vote for Jeff Johnson, the name you know!

I would hope that’s not the case, especially because Hudson doesn’t seem to play a big role in Oracle’s product map, but if it were otherwise… I wouldn’t be so sure.

Speaking at Javagruppen, the Danish JUG annual conference

The guys at Javagruppen, the Danish JUG, are holding their annual conference on February 11th and 12th.

The theme for this year is “Java, a cloudy affair”, and I’ll be speaking on building and testing in the cloud, using Apache Maven, Continuum, TestNG, Selenium,… and how to take full advantage of cloud features for software development, aligned with my previous talks.

This year the conference will be at a 5-star hotel and spa in the middle of Denmark, and I gotta say I look forward to it; it seems they know how to choose a location (last year they did it at a castle).

You can still sign up if you want to go.

Comwell Kellers Park

Github forking after cloning

When you want to change or contribute to other people’s projects that you don’t have write access to, you usually fork the project and then use your read+write fork.

What if you first cloned their repo and made local commits that you now want to contribute? You don’t want to mess with patches, so here’s what I did to contribute a small fix to the example project from Apache Maven 2 Effective Implementation, a great book by Brett Porter and Maria Odea Ching.

  1. Fork the project repo at Github (at https://github.com/brettporter/centrepoint/)
  2. In your local clone, rename the remote origin to upstream
  3. Add a new remote called origin pointing to the read+write fork
  4. Point the master branch at origin instead of upstream
  5. Fetch the remote and push your changes
git remote rename origin upstream                               # step 2
git remote add origin git@github.com:carlossg/centrepoint.git   # step 3
git fetch origin
git branch --set-upstream master origin/master                  # step 4: track origin/master
git push origin                                                 # step 5
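To double-check the result, listing the remotes should show origin pointing at your fork and upstream at the original repo:

git remote -v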