
Creating a Bubble Catalogue

In recent weeks, I’ve spent much of my time figuring out how to use all of your drawings to determine where the bubbles are in the Spitzer data. About a month ago we had a breakthrough. Thanks to a lengthy conversation with MWP science guru Matthew Povich, I realised that one of the reasons it is so hard to determine where a bubble should be drawn is that sometimes there is no right answer! There are many bubbles in the MWP that people would disagree on how to draw, because the question “where is the bubble?” often has no single right answer.

An example of just such a bubble is shown below, with all user drawings shown next to it. You can see that this bubble just isn’t that easy to draw and that there are even two or three structures within the image that one could call a bubble. Instead of trying to make this fit a rigid one-bubble definition, we realised that we should be using the human ability to recognise patterns. After all – this is exactly what you are all so good at, and computers are sometimes not.

Matthew and I decided that what we should do in these instances is simply allow two (or even three) bubbles to be deemed ‘real’. The inner, red structure is a kind of bubble, and so is the open-ended green bubble just outside of it. One could also perceive a third bubble just below and to the left of these, and many people appear to have drawn just that. (This is in addition to the multitude of smaller bubbles around the edge, of course.) Whatever catalogue is produced by our data reduction, it should probably include at least the first two structures if enough people drew them.

This decision has made creating a cleaned bubble catalogue much easier. The data reduction process described in my February blog post is still the process I’m using, although it has been greatly refined. More importantly, since February an enormous number of new bubbles have been drawn and this means the averaging process produces better results. Below you can see some results of the latest efforts and hopefully you’ll agree that what is being produced is a good catalogue, based on what you have all drawn. For the sake of testing, I am using one 3-by-2 degree section of the data. This is the region +12 degrees from the galactic centre and contains several interesting and complex features – which makes it a good testing ground.

Below you can see the 3×2 degree tile on its own, with all of your 7,000+ bubbles drawn on top and with the resultant ‘cleaned’ bubbles as well. You can click on any of the images to see the full version.

I have also been looking into other techniques for extracting the bubbles as the crowd sees them. Below you can see just the raw bubble data, drawn by users for this tile. With the background removed, we can use a simple contrast ratio to create a threshold, which we use to cut out the bubbles from the original image.

This is another method for extracting data, and although it is harder to define a rigid catalogue of bubbles using this method, it may still have use in mapping regions of star formation in our galaxy.
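To make the thresholding idea concrete, here is a minimal sketch. It assumes the raw drawings have already been rasterised into a "coverage" grid that counts how many user drawings overlap each pixel, and it treats the contrast ratio as a fraction of the peak coverage. The function names, the 0.5 ratio and the tiny grids are all hypothetical and are not taken from the actual MWP pipeline.

```python
def threshold_mask(coverage, ratio=0.5):
    """Return a boolean mask that is True where coverage >= ratio * peak.

    `coverage` is a list of rows; each value is the number of user
    drawings overlapping that pixel (an assumed input format).
    """
    peak = max(max(row) for row in coverage)
    if peak == 0:
        # Nothing was drawn anywhere, so nothing passes the threshold.
        return [[False for _ in row] for row in coverage]
    cutoff = ratio * peak
    return [[value >= cutoff for value in row] for row in coverage]


def cut_out(image, mask, fill=0):
    """Keep image pixels where the mask is True and blank the rest."""
    return [
        [pixel if keep else fill for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Toy 2x3 example: only the two most heavily drawn pixels survive.
coverage = [[0, 1, 4],
            [2, 8, 3]]
image = [[10, 20, 30],
         [40, 50, 60]]
mask = threshold_mask(coverage, ratio=0.5)
cutout = cut_out(image, mask)
```

A higher ratio keeps only the regions where many drawings pile up; a lower one keeps fainter, less agreed-upon structure.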

Reducing the Data

I’ve spent much of the past two weeks messing about with different ways to reduce over 200,000 bubbles (now almost 220,000) into a sensible catalogue. This gets very messy so I will try and explain what I’ve been up to in stages. This is a process called data reduction and for a crowd-sourced citizen science project like the MWP, it can get complicated. I thought it may interest some of you to see where we currently are in the process of turning your clicks into results.

The key part of the data reduction problem is that we have a very large set of data – the massive number of bubbles that have been drawn – and need to decide which among them are ‘similar’ to each other. We need to keep some flexibility in our definition of similarity because right now, I’m not sure what ‘similar’ means.

Essentially, bubbles are ‘similar’ when two people draw a similarly sized bubble in a similar location. This is something that sounds remarkably easy to say but was hard to do well in code. Comparing 200,000 bubbles to each other is obviously computationally intensive.
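As a rough illustration of what such a similarity test could look like, here is a minimal sketch. The `Bubble` fields, the tolerance values and the flat-sky distance are all assumptions made for the example; the real definition of 'similar' is still being tuned, which is why the tolerances are left as parameters.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Bubble:
    lon: float       # galactic longitude of the centre, degrees
    lat: float       # galactic latitude of the centre, degrees
    diameter: float  # maximum diameter, degrees


def is_similar(a, b, pos_tol=0.25, size_tol=2.0):
    """Hypothetical similarity test for two drawings.

    Centres must lie within pos_tol * (mean diameter) of each other,
    and the larger diameter may be at most size_tol times the smaller.
    Uses a flat-sky approximation, reasonable for small separations.
    """
    mean_size = 0.5 * (a.diameter + b.diameter)
    separation = hypot(a.lon - b.lon, a.lat - b.lat)
    larger = max(a.diameter, b.diameter)
    smaller = min(a.diameter, b.diameter)
    return separation <= pos_tol * mean_size and larger <= size_tol * smaller


def naive_pair_count(n):
    """Number of distinct pairs when every bubble is compared to every other."""
    return n * (n - 1) // 2


same = is_similar(Bubble(10.0, 0.0, 0.20), Bubble(10.02, 0.01, 0.25))
different = is_similar(Bubble(10.0, 0.0, 0.20), Bubble(10.5, 0.0, 0.20))
pairs = naive_pair_count(200_000)  # roughly 2 x 10^10 comparisons
```

The pair count is the reason a brute-force comparison is unattractive: 200,000 bubbles give about twenty billion pairs.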

Screen shot 2011-02-22 at 10.23.07

In the end I decided that since the size of bubbles was a consideration, I would move across the galaxy, looking on ever-decreasing orders of size. To do this I split the galaxy into 2×2 degree boxes and take each box in turn. In each box I see if there are bubbles here that are of the order of the size of the box (meaning they have a maximum diameter that is between a half- and a whole-box). If there are bubbles on that scale I run a clustering algorithm and pick out groups of these bubbles with central positions clustered to within one quarter of the box size. If a cluster is found, those bubbles are then saved and removed from the whole list. I then divide the box into four and repeat until no bubbles are found.

Screen shot 2011-02-22 at 10.22.42

This method means that when a box contains no bubbles, we need not continue down in size scale, but when it does contain bubbles we always split and inspect the four child boxes. In this way we move through the galaxy, in ever-decreasing boxes, but in a fairly efficient manner.
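The box-splitting scheme described above can be sketched as a short recursive function. This is only an illustration under stated assumptions: the `Bubble` fields are hypothetical, the greedy grouping stands in for whichever clustering algorithm is really used, and the `min_size` and `min_members` parameters are invented safeguards for the example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass(frozen=True)
class Bubble:
    lon: float       # centre longitude, degrees
    lat: float       # centre latitude, degrees
    diameter: float  # maximum diameter, degrees


def greedy_clusters(bubbles, radius, min_members):
    """Stand-in clustering: group bubbles whose centres lie within
    `radius` of a seed bubble; keep groups with enough members."""
    unassigned = list(bubbles)
    clusters = []
    while unassigned:
        seed = unassigned[0]
        group = [b for b in unassigned
                 if hypot(b.lon - seed.lon, b.lat - seed.lat) <= radius]
        unassigned = [b for b in unassigned if b not in group]
        if len(group) >= min_members:
            clusters.append(group)
    return clusters


def reduce_box(bubbles, x0, y0, size, min_size=0.01, min_members=2):
    """Cluster bubbles at the scale of this box, then recurse into
    the four child boxes with the smaller bubbles that remain."""
    inside = [b for b in bubbles
              if x0 <= b.lon < x0 + size and y0 <= b.lat < y0 + size]
    if not inside or size < min_size:
        return []  # empty box: no need to continue down in scale
    # Bubbles "of the order of the box": between half a box and a whole box.
    at_scale = [b for b in inside if size / 2 < b.diameter <= size]
    clusters = greedy_clusters(at_scale, size / 4, min_members)
    # Only smaller bubbles are handed down; bubbles at this scale are
    # either saved in a cluster or dropped as unclustered.
    smaller = [b for b in inside if b.diameter <= size / 2]
    half = size / 2
    for dx in (0, half):
        for dy in (0, half):
            clusters += reduce_box(smaller, x0 + dx, y0 + dy, half,
                                   min_size, min_members)
    return clusters


drawings = [
    Bubble(11.0, 1.0, 1.2), Bubble(11.1, 1.0, 1.3), Bubble(10.9, 1.1, 1.1),
    Bubble(10.30, 0.30, 0.20), Bubble(10.32, 0.31, 0.20),
    Bubble(11.7, 1.7, 0.20),  # drawn by one person only, so never clustered
]
clusters = reduce_box(drawings, x0=10.0, y0=0.0, size=2.0)
```

In this toy run the three large drawings are grouped at the 2-degree scale, and the pair of small drawings is only found three levels down, in a 0.25-degree box.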

We also have to perform the same analysis with an offset grid. The procedure is exactly the same, but the shifted grid makes sure we catch bubbles that fell on the borders of boxes in the first pass.
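The effect of the offset grid can be seen in a small sketch. The shift of half a box is an assumption made for the example (the actual offset is not stated here), and both helper functions are hypothetical.

```python
from math import floor


def box_origin(lon, lat, size, offset=0.0):
    """Lower-left corner of the grid box containing (lon, lat),
    for a grid of `size`-degree boxes shifted by `offset` degrees."""
    x0 = floor((lon - offset) / size) * size + offset
    y0 = floor((lat - offset) / size) * size + offset
    return x0, y0


def distance_to_border(lon, lat, size, offset=0.0):
    """Distance from a bubble centre to the nearest edge of its box."""
    x0, y0 = box_origin(lon, lat, size, offset)
    return min(lon - x0, x0 + size - lon, lat - y0, y0 + size - lat)


# A bubble centred exactly on a box edge of the regular grid...
regular = distance_to_border(12.0, 0.5, size=2.0)
# ...sits comfortably inside a box of the grid shifted by half a box.
shifted = distance_to_border(12.0, 0.5, size=2.0, offset=1.0)
```

Drawings of a bubble that straddles an edge in one grid are split between neighbouring boxes there, but fall together in the other grid, at the cost of some bubbles being found twice.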

Once we have passed across the galaxy on all size scales, we need to make sure we’ve cleaned up the duplicates created by the offset grid. We do this by considering our newly created list of ‘clean’ bubbles and running through them in order of size. When we find bubbles of a similar size and location they are combined, according to the number of users that drew that bubble. This can be done more easily now that there are far fewer bubbles (in my tests we have dropped to around 5% of the initial number by this stage).
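A minimal sketch of this clean-up step is shown below. It assumes each cleaned bubble carries a count of the users who drew it, and it combines duplicates with an average weighted by that count. The field names and tolerances are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class CleanBubble:
    lon: float
    lat: float
    diameter: float
    n_users: int  # how many users' drawings went into this bubble


def similar(a, b, pos_tol=0.25, size_tol=1.5):
    """Assumed similarity test: close centres and comparable sizes."""
    mean_size = 0.5 * (a.diameter + b.diameter)
    separation = hypot(a.lon - b.lon, a.lat - b.lat)
    larger = max(a.diameter, b.diameter)
    smaller = min(a.diameter, b.diameter)
    return separation <= pos_tol * mean_size and larger <= size_tol * smaller


def combine(a, b):
    """Merge two bubbles, weighting each by its number of users."""
    total = a.n_users + b.n_users
    wa, wb = a.n_users / total, b.n_users / total
    return CleanBubble(
        lon=wa * a.lon + wb * b.lon,
        lat=wa * a.lat + wb * b.lat,
        diameter=wa * a.diameter + wb * b.diameter,
        n_users=total,
    )


def merge_duplicates(bubbles):
    """Run through the bubbles in order of size, folding each one into
    the first already-kept bubble it resembles."""
    merged = []
    for bubble in sorted(bubbles, key=lambda b: b.diameter, reverse=True):
        for i, kept in enumerate(merged):
            if similar(kept, bubble):
                merged[i] = combine(kept, bubble)
                break
        else:
            merged.append(bubble)
    return merged


catalogue = merge_duplicates([
    CleanBubble(10.00, 0.0, 0.20, n_users=6),
    CleanBubble(10.02, 0.0, 0.22, n_users=2),  # duplicate from the offset grid
    CleanBubble(15.00, 1.0, 0.20, n_users=4),
])
```

Because the merged bubble is a weighted average, the version drawn by six users pulls the result towards itself more strongly than the version drawn by two.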

Results

My initial run only looked at bubbles in the longitude range 0-30 degrees. Below are three views of one image from the MWP set (one of my favourites, as lots of people see it differently). First you can see the image as it is shown to MWP users. Below that you see, overlaid in blue, the original bubbles as drawn by the users. In the third image you can see the same, but this time displaying the ‘cleaned’ results. In the original set the bubbles all have the same opacity, such that when they pile up you can see the similarities. The cleaned set gives the bubbles opacities according to their scores (more opaque bubbles mean more users drew them).

GLM011680081mosaicI24M1

mwp_test_all_bubbles

mwp_test_clean_bubbles
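For anyone curious how a score might be turned into an opacity, here is a minimal sketch. The linear mapping and the minimum opacity of 0.1 are assumptions made for the example, and the score is taken to be simply the number of users who drew the bubble.

```python
def score_to_opacity(scores, min_alpha=0.1):
    """Map each score linearly onto an opacity between min_alpha and 1.0,
    with the highest-scoring bubble drawn fully opaque."""
    if not scores:
        return []
    top = max(scores)
    return [min_alpha + (1.0 - min_alpha) * score / top for score in scores]


# Three cleaned bubbles drawn by 1, 5 and 10 users respectively.
opacities = score_to_opacity([1, 5, 10])
```

Keeping a small minimum opacity means that even a bubble with a low score remains faintly visible in the overlay.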

It should be noted that the cleaned image does not yet display arcs, but rather always shows an entire ellipse. This is because I am not yet including the bubble cut-outs (which you can make out in the middle image) in the data reduction. These will be included at a later time.

You can see that I’m still getting some duplication at the end of the process – I may need to sweep across the final catalogue looking for similar bubbles until I reach a convergence when all bubbles are ‘unique’. I have been experimenting with this with mixed results but will continue my efforts.
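The idea of sweeping until convergence can be sketched generically: keep applying a merging pass until the catalogue stops shrinking. The example below uses a deliberately simple one-dimensional toy pass that merges a single close pair each time. Both functions and the `max_passes` limit are hypothetical and only illustrate the loop, not the real bubble merging.

```python
def sweep_until_stable(catalogue, sweep, max_passes=20):
    """Apply `sweep` repeatedly until a pass removes nothing.

    Returns the final catalogue and the number of passes performed.
    """
    for n_pass in range(1, max_passes + 1):
        reduced = sweep(catalogue)
        if len(reduced) == len(catalogue):
            return reduced, n_pass  # converged: every entry is 'unique'
        catalogue = reduced
    return catalogue, max_passes


def merge_one_pair(values, tol=0.2):
    """Toy pass: merge the first pair of neighbouring values closer
    than `tol`, replacing them with their mean."""
    values = sorted(values)
    for i in range(len(values) - 1):
        if values[i + 1] - values[i] <= tol:
            mean = (values[i] + values[i + 1]) / 2
            return values[:i] + [mean] + values[i + 2:]
    return values


final, passes = sweep_until_stable([1.0, 1.1, 5.0, 5.05, 9.0], merge_one_pair)
```

Here two passes each remove one duplicate and a third confirms that nothing else can be merged, so the loop stops after three passes with three entries left.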

If you’re still reading, I look forward to reading your comments. As I continue to make adjustments and progress with this reduction, I shall blog the results again. Many members of the science team are also having a go at this problem and so the final result may be quite different in the end as we improve things. I hope that this is an interesting insight into some of what goes on behind the scenes of the MWP.

Talk Updates

Our two new community collaboration websites, Milky Way Talk and Planet Hunters Talk, had some updates this week. We thought it was worth going over them in this blog post. We’ve had a lot of feedback about Talk and are working to implement the most-requested features.

The biggest difference you’ll see when logging into Talk is that your discussions are now easier to manage and track. A new, large box on the main page shows all the new and updated discussions since your last login. You can refine these using the two drop-down boxes at the top of this section. You can choose to show discussions from the last 24 hours, the last week, or since any date using a pop-up calendar. You can also choose to only see discussions that you are a part of, which should help you keep track of your conversations.

In addition to these changes, you’ll also find a lot more metadata around the discussions, telling you who last posted, how many people are taking part, and who started the discussion, where relevant. Users within these discussions are now highlighted if they are part of the development team or the science team. This is something a lot of you asked for.

Talk Screenshot

The other item that has been changed with this Talk update is pagination. There are now easy-to-use buttons on the discussions, collections and objects on the front page. These mean that you can browse back through time and see more than just the most recent items. As Talk has grown more popular, this feature has become more necessary.

Another change to the front page is that we now show the most-recent items by default, and not the trending items. You can still see the trending items by clicking the link at the top. Users told us they preferred to see recent activity initially so we made the change. Similarly, the ‘trending keywords’ list now appears on the front page at all times.

Finally, page titles are now meaningful. This means that if you bookmark or share a link, you’ll remember why. Collections are named and objects will be titled using their Zooniverse ID (e.g. AMW….). Several of you have also noted our lack of a favicon (the little icon next to the URL in your browser bar). This is coming shortly as well.

There are more changes planned for Talk, but these significant updates to the front page were worth noting on the blog. For example, we plan to start integrating social media links into the Talk sites, along with more updates as time goes by. Talk continues to evolve and we welcome feedback at team@milkywayproject.org.

Examples of Interesting Objects

GLM_01270-0013_mosaic_I24M1

Feedback from everyone about the Milky Way Project has been overwhelmingly positive. You all seem to love the images and the interface. One thing that is always requested though, is more tutorial examples of the things we’d like you to flag as areas of interest: green knots, dark nebulae, star clusters etc.

We decided it was best to use Talk, the Milky Way Project’s discussion/collections site, to show off examples of the objects you might spot as you draw all over the galaxy. We’ve built collections of green knots, dark nebulae, small bubbles, star clusters, galaxies and fuzzy red objects. The great thing about using Talk to do this is that we can easily add more in as we – or rather you – find them.

All the new example collections were built using the classifications you have made so far. We used your first 100,000 classifications to create lists of the objects most regularly flagged in each category. Hopefully you will find these useful in learning how to spot some of the amazing things that are out there in the Milky Way (and sometimes, beyond)!

A side effect of creating these collections was that I found the image above along the way. It appears to contain green knots, small bubbles, dark nebulae, red fuzzies and a small star cluster. If anyone can see a galaxy in there, it’s a full house! You can, of course, also discuss this image on Talk.

If you have comments or suggestions for the Milky Way Project, you can email us on team@milkywayproject.org.

Your Favourite Images

When you’re drawing bubbles, star clusters and everything else all over the Milky Way, you have the option to click a little ‘star’ button to mark an image as a favourite. These are then visible in the ‘My Galaxy’ portion of the site. Primarily this is done to let you keep hold of the images that you like the most. A side effect though is that we can see which images are collectively seen as the best by the Milky Way Project community.

Below you can see the 10 most-favourited images from the Milky Way Project. I’ll let the images speak for themselves. You can click on any of them to jump into Milky Way Talk where you can learn more about them or make a comment. These images also exist as a collection in Talk, where you can also comment and discuss them as a group.




Project Update

MWP-Poster-Small

We are presenting a poster about the Milky Way Project at the 217th Meeting of the American Astronomical Society. This gives us a great opportunity to outline the current status of the project. You can download the poster as either a PDF (2.5 MB) or a big JPEG (14 MB).

During the first four weeks of the project, 10,000 volunteers drew more than 385,000 bubbles, galaxies, clusters and other objects using the site. Volunteers measure the location, diameter, eccentricity and thickness of bubbles, as well as marking any gaps in the bubble’s structure. For other objects, just the location and approximate angular size are recorded.

The public’s individual drawings of objects, such as bubbles, are combined and grouped to produce ‘clean’ catalogues. When the project is complete, both the original and cleaned catalogues will be made public. At present there are over 100,000 individual bubble drawings, which reduce down to about 60,000 when cleaned. If we consider only those instances where more than 3 individuals agreed that a bubble was present, we have found approximately 5,000 bubbles.
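The agreement cut described above amounts to a simple filter on the cleaned catalogue. The sketch below assumes each cleaned bubble records how many individuals drew it. The dictionary keys and the sample entries are hypothetical.

```python
def agreed_bubbles(catalogue, min_individuals=3):
    """Keep only bubbles that MORE than `min_individuals` people drew."""
    return [b for b in catalogue if b["n_users"] > min_individuals]


# Invented sample entries, not real MWP data.
cleaned = [
    {"id": "a", "n_users": 1},
    {"id": "b", "n_users": 3},   # exactly 3 is not 'more than 3'
    {"id": "c", "n_users": 4},
    {"id": "d", "n_users": 12},
]
confident = agreed_bubbles(cleaned)
```

Raising the threshold trades completeness for reliability: fewer bubbles survive, but each one has been seen independently by more people.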

Similarly, after cleaning the data, we have found over 1,000 infrared dark clouds, 596 compact bubbles, 65 star clusters and 5 galaxies.

I’m glad to say that since printing the poster the numbers have already changed – this is because the site continues to have over a thousand images processed each day. We’re now at nearly 115,000 bubbles drawn and 91,000 images served. Check out the main site for the latest figures.

Site Goes Live

example

Today we have launched The Milky Way Project. The Milky Way Project aims to sort and measure our galaxy. Initially we’re asking you to help us find and draw bubbles in beautiful infrared data from the Spitzer Space Telescope. Understanding the cold, dusty material that we see in these images helps scientists to learn how stars form and how our galaxy changes and evolves with time.

As well as drawing out bubbles in our galaxy, we’re also asking you to mark other objects such as star clusters, galaxies and ghostly red ‘fuzzy’ objects. We’re asking you to help us map star formation in our galaxy! Take a look at our tutorial page for the complete run down, with examples.

Interface

Also launching today is the Zooniverse’s new collaboration and community tool: Talk. Milky Way Talk resides at http://talk.milkywayproject.org and there you can find, collect and comment on the objects you see in the Milky Way Project. Every time you classify an image in the Milky Way Project you will be prompted to ‘discuss’ that image via Talk. Talk lets you collect objects together and share those collections with everybody else. Talk is a brand new feature, developed in-house at Zooniverse HQ. It continues to evolve and change as you use it and we hope that through the Milky Way Project, we can make Talk even better.

Collection in Talk

Don’t forget, you can find us on twitter @milkywayproj and we hope to see you soon on Milky Way Talk!

The Milky Way Project

So after we added a third entry a couple of days ago, it rapidly ran ahead of the pack on the last day of voting. We had more votes on the final day than in all the time leading up to the decision. But we have a name: The Milky Way Project. Stellar Zoo was a close second, and both beat Milky Way Zoo by some way.

236088main_milkyway516

Over the next few days, this blog will change from ‘Project IX’ to ‘The Milky Way Project’. We have a new twitter feed @milkywayproj and eventually the URL for this blog will also change. I’ll give plenty of warning about that.

The following two weeks involve a big code push here in Oxford, to try and create a beta site for you to try out. There will be more updates soon with a sample of the first science interface on its way…

[Image credit: NASA/JPL-Caltech]

Suggest a Name for Project IX

For some time we have been purposefully calling this ‘Project IX’. This was done to prevent us defining the project before it has really started up. We also thought that you all might have some thoughts on the matter. Hence, we’re asking for suggestions for a name for Project IX.

What we’re not going to call it is ‘Star Formation Zoo’. That is one of the names this project has had internally, but we don’t like it. Sure, Project IX is essentially a Zoo about star formation but this doesn’t feel like a good name. In fact, it doesn’t have to be a ‘zoo’ at all. Solar Stormwatch and Project VII don’t have ‘zoo’ in the title… but I can’t talk about that.

the-dancers

What might inspire a name? Hopefully the science or the imagery – but really anything that comes to mind. We’re dealing with beautiful infrared, dusty images of cold regions of the Milky Way [UPDATE: and also some hotter things like young stars and heated material]. We’re looking at bubbles and young stellar objects – but hoping to spy more. (The Spitzer image above is one I call ‘the dancers’, by the way.)

If you have any ideas on what we can call Project IX, we’d love to hear them. Please leave suggestions in the comments below and we’ll start collating them. If a consensus cannot be reached, we can try a vote of Zooniverse users. Anyone can suggest a name, so if the science team is reading this, feel free to pitch in as well.

UPDATE: After letting the comments run for a while it seems that ‘Milky Way Zoo’ and ‘Stellar Zoo’ are the two more popular names. So here’s a poll – which name should we pick?

[poll id="2"]

The Bubbling Galactic Disk

Some of the most beautiful structures in Spitzer GLIMPSE data are the bubbles. Bubbles are regions of gas, usually found around newly formed stars, often with shells of material surrounding them (the green 8 μm emission above). These appear as rings in the GLIMPSE images and can vary in appearance from strikingly prominent to intriguingly faint. They can be anything from complete circles and ellipses, to fractured, fragmented remains.

As part of Project IX we’re going to ask you to find and measure these bubbles. Researchers can use this information to learn a lot about how these objects form and how they trigger star formation.

RCW120

Above is an image of RCW 120, the titular “perfect bubble” from a 2009 paper by Deharveng, Zavagno, Schuller, Caplan, Pomarès and De Breuck. This colour-composite image shows Hα emission in blue, 8 μm emission in green and the 24 μm emission of small dust grains in red. The image is approximately 24′ (arcminutes) wide.

The green material has been swept up as the region expanded, after the formation of a massive star in the centre. There are about 2000 Solar masses of neutral material here, and this has fragmented into lumps. This is where star formation is occurring. The authors of the study found 138 potential star-forming objects in the ring around RCW 120.

In 2006, a group of astronomers visually inspected the GLIMPSE data for bubbles and catalogued their results in a paper titled ‘The Bubbling Galactic Disk’. I’m a sucker for a great paper title. The team behind this study has been looking at the GLIMPSE data ever since. As mentioned above, bubbles are important features in the study of star formation. Here’s how they are described in ‘The Bubbling Galactic Disk’:

The study of bubbles gives information about the stellar winds that produce them and the structure and physical properties of the ambient ISM – interstellar medium – into which they are expanding. Additional physical insights include the hydrodynamics of gas and dust in expanding bubbles, the impact of expanding bubbles on magnetic fields in the diffuse ISM, and mass-loss rates during the evolution of stars.

In 2006, 322 bubbles were visually identified by just a handful of people. Since that time two things have changed. Firstly, there is now a lot more data, and therefore more bubbles. Spitzer has also continued to map more of the galactic plane for the GLIMPSE360 project – more on that in a later post. Secondly, the Zooniverse now exists!

Every time the Zooniverse and bubbles have been mentioned together, someone has been there saying that we should get the public to find and measure them. This has happened in conversations between Chris, myself and others at Zooniverse HQ, and between Grace Wolf-Chase at Adler Planetarium and various members of the ‘The Bubbling Galactic Disk’ team. Bubbles and the Zooniverse should be a match made in the heavens.

Why are bubbles such a good target? For many reasons. They are not only amazing to look at, but also are numerous in the GLIMPSE data. They are tricky to measure but not impossibly hard. They are scientifically valuable objects to catalogue and measure the properties of, and they require more than one independent, human measurement to get a good handle on – this is key of course.

Many of the folks behind ‘The Bubbling Galactic Disk’ are part of the Project IX science team. We hope that everyone out there in the Zooniverse community can help refine and expand the existing bubble catalogue as part of Project IX. With the addition of new data, we also hope to find many new bubbles.

We are currently developing the bubble tool – the first user interface portion of the Project IX site – which will have similarities to the Moon Zoo crater tool. We hope to be able to share it with you all soon so that you can help us to test and refine it. It is exciting to be able to involve everyone at this early stage.

If you’re interested in following this project and its take on bubbles, I’d suggest reading ‘The Bubbling Galactic Disk’ and looking at the GLIMPSE website. If you have any questions – let us know. Bubbles are just one thing we can see in the GLIMPSE data. More posts will follow about other scientifically useful objects lurking in this amazing infrared archive.
