Parsing Tweets with the TwitterString Class

While building a small Twitter aggregator for an upcoming conference, I needed a set of methods to create hyperlinks from three distinct elements that can appear in a tweet: links, usernames, and hashtags.

I found regular expressions for the heavy lifting from various sources on the Web and wrapped them in a class that does all the processing with one method call. Here’s an example of it working, followed by example code and the class itself.

Example SWF

Demo SWF (300 × 100): http://inflagrantedelicto.memoryspiral.com/wp-content/uploads/2010/01/TwitterStringDemo.swf


Example MXML
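
A minimal sketch of how the class might be wired into a Flex 3 application, offered as an illustration under stated assumptions. The component ids (`tweetInput`, `tweetOutput`), the `renderTweet` handler, and the sample tweet text are all hypothetical; only `TwitterString.instance.parseTweet` comes from the class itself.

```mxml
<?xml version="1.0" encoding="utf-8"?>
<mx:Application xmlns:mx="http://www.adobe.com/2006/mxml" layout="vertical">

	<mx:Script>
		<![CDATA[
			import com.fracturedvisionmedia.utils.TwitterString;

			// Hypothetical handler: run the raw tweet through the parser
			// and display whatever markup it returns.
			private function renderTweet():void {
				tweetOutput.htmlText = TwitterString.instance.parseTweet(tweetInput.text);
			}
		]]>
	</mx:Script>

	<!-- Raw tweet text; the sample value is made up for illustration -->
	<mx:TextInput id="tweetInput" width="300"
		text="Hello @someuser, see http://example.com #flex"/>
	<mx:Button label="Parse" click="renderTweet()"/>

	<!-- htmlText renders any HTML markup returned by parseTweet -->
	<mx:Text id="tweetOutput" width="300"/>

</mx:Application>
```

Assigning to `htmlText` rather than `text` matters here, since the parser's output is meant to be displayed as markup and not as literal characters.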

TwitterString Class

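package com.fracturedvisionmedia.utils {
	
	/**
	 * The TwitterString class assists with the parsing of a tweet to add hyperlinks 
	 * around Links, HashTags, and UserNames in a tweet.
	 * @author Joseph Labrecque
	 * v. 0.1.2
	 */ 
	
	public final class TwitterString {
		// Eagerly created singleton; the constructor guard rejects any second instance.
		private static var _instance:TwitterString = new TwitterString();
		
		public function TwitterString(){
			if (_instance != null){
				throw new Error("TwitterString can only be accessed through TwitterString.instance");
			}
		}
		
		public static function get instance():TwitterString {
			return _instance;
		}
		
		// Runs the three passes in order: links first, then usernames, then hashtags.
		public function parseTweet(t:String):String {
			var step1:String = parseHyperlinks(t);
			var step2:String = parseUsernames(step1);
			var step3:String = parseHashtags(step2);
			return step3;
		}
		
		// Wraps @username when it appears at the start of the text or after whitespace.
		// The anchor markup and profile URL are assumptions; the original markup is not shown here.
		private function parseUsernames(t:String):String {
			var result:String = t.replace(/(^|\s)@(\w+)/g, "$1<a href='http://twitter.com/$2'>@$2</a>");
			return result;
		}
		
		// Wraps #hashtag when it appears at the start of the text or after whitespace.
		// The search URL is an assumption; %23 keeps the # character in the query.
		private function parseHashtags(t:String):String {
			var result:String = t.replace(/(^|\s)#(\w+)/g, "$1<a href='http://twitter.com/search?q=%23$2'>#$2</a>");
			return result;
		}
		
		// Wraps ftp://, http://, and https:// URLs ($1 is the whole URL).
		// The anchor markup is an assumption, as above.
		private function parseHyperlinks(t:String):String {
			var urlPattern:RegExp = /((?:f|ht)tps?:\/\/[-a-zA-Z0-9@:%_+.~#?&\/=]+)/g;
			var result:String = t.replace(urlPattern, "<a href='$1'>$1</a>");
			return result;
		}
		
	}
}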

Download TwitterString

8 thoughts on “Parsing Tweets with the TwitterString Class”

  2. Hi.

    Thank you for sharing this!
    I needed something like that, and I made a port to haXe.
    I noticed a couple of “mistakes” (they don’t qualify as bugs, since the code works as it is).

    1) The method parseHashtags does in fact parse usernames
    2) The method parseUsernames, on the other hand, does parse hashtags.

    Also, I checked Twitter’s own parser, and the parsed hashtags maintain the # character in the search link.

    1. Yeah, thanks. I expect there will be some “interesting” bits in there. It was written in a few hours last year out of immediate necessity :)

      Awesome that you ported to haXe! Is it public anywhere?
