The main new point in this post is that more advanced matching techniques can be used with regular expressions. More importantly, every group is matched, including groups nested inside another group.
Note: The examples use Scala 2.8. Most examples will work with 2.7 but I believe the last example is Scala 2.8 only.
- scala> val date = "11/01/2010"
- date: java.lang.String = 11/01/2010
- scala> val Date = """(\d\d)/(\d\d)/(\d\d\d\d)""".r
- Date: scala.util.matching.Regex = (\d\d)/(\d\d)/(\d\d\d\d)
- /*
- When a Regex object is used in matching each group is assigned to a variable
- */
- scala> val Date(day, month, year) = date
- day: String = 11
- month: String = 01
- year: String = 2010
- scala> val Date = """(\d\d)/((\d\d)/(\d\d\d\d))""".r
- Date: scala.util.matching.Regex = (\d\d)/((\d\d)/(\d\d\d\d))
- /*
- This example demonstrates that all groups must be assigned; otherwise a MatchError is thrown
- */
- scala> val Date(day, monthYear, month, year) = date
- day: String = 11
- monthYear: String = 01/2010
- month: String = 01
- year: String = 2010
- scala> val Date(day, month, year) = date
- scala.MatchError: 11/01/2010
- at .<init>(<console>:5)
- at .<clinit>(<console>)
- // but placeholders work in Regex matching as well:
- scala> val Date(day, _, month, year) = date
- day: String = 11
- month: String = 01
- year: String = 2010
- scala> val Names = """(\S+) (\S*)""".r
- Names: scala.util.matching.Regex = (\S+) (\S*)
- scala> val Names(first, second) = "Jesse Eichar"
- first: String = Jesse
- second: String = Eichar
- /*
- If you want to use Regex's in assignment you must be sure the match will work. Otherwise you should do real matching
- */
- scala> val Names(first, second) = "Jesse"
- scala.MatchError: Jesse
- at .<init>(<console>:5)
- at .<clinit>(<console>)
- scala> val M = """\d{3}""".r
- M: scala.util.matching.Regex = \d{3}
- /*
- There must be a group in the Regex or the match will fail, even when the input matches the pattern
- */
- scala> val M(m) = "123"
- scala.MatchError: 123
- at .<init>(<console>:5)
- at .<clinit>(<console>)
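Since every capturing group has to be bound (or skipped with `_`), one way to avoid the extra variable in the nested-group example above is a non-capturing group, `(?: ... )`. The following is only a sketch; the object name `NonCapturingGroups` and the `parse` helper are made up for illustration.

```scala
object NonCapturingGroups {
  // (?: ... ) groups the month/year part without capturing it,
  // so the extractor yields three groups instead of four.
  val Date = """(\d\d)/(?:(\d\d)/(\d\d\d\d))""".r

  // Hypothetical helper: returns None instead of throwing a MatchError.
  def parse(s: String): Option[(String, String, String)] = s match {
    case Date(day, month, year) => Some((day, month, year))
    case _                      => None
  }
}
```

With this pattern `NonCapturingGroups.parse("11/01/2010")` binds only day, month and year, and an input that does not match simply yields `None`.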
The following are a few more complex examples:
- scala> val Date = """((\d\d)/(\d\d)/(\d{4}))|((\w{3}) (\d\d),\s?(\d{4}))""".r
- Date: scala.util.matching.Regex = ((\d\d)/(\d\d)/(\d{4}))|((\w{3}) (\d\d),\s?(\d{4}))
- /*
- The Regex contains an alternation (|), so only half of the groups will be non-null.
- If the first group is a String (non-null), the next three groups
- are day/month/year.
- Otherwise, if the 5th group is a String, the pattern is month day, year.
- Lastly, a catch-all.
- */
- scala> def printDate(date:String) = date match {
- | case Date(_:String,day,month,year,_,_,_,_) => (day,month,year)
- | case Date(_,_,_,_,_:String,month,day,year) => (day,month,year) // process month
- | case _ => ("x","x","x")
- | }
- printDate: (date: String)(String, String, String)
- scala> printDate("Jan 01,2010")
- res0: (String, String, String) = (01,Jan,2010)
- scala> printDate("01/01/2010")
- res1: (String, String, String) = (01,01,2010)
- /*
- A silly example that drops the first element of the date string.
- Not useful, but it demonstrates that we are matching against a sequence, so
- _* can be used to match the rest of the groups
- */
- scala> def split(date:String) = date match {
- | case d @ Date(_:String ,_*) => d drop 3
- | case d @ Date(_,_,_,_,_:String,_*) => d drop 4
- | case _ => "boom"
- | }
- split: (date: String)String
- scala> split ("Jan 31,2004")
- res5: String = 31,2004
- scala> split ("11/12/2004")
- res6: String = 12/2004
- /*
- This is just a reminder that findAllIn returns an iterator, which (since it is probably short) can be converted to a sequence and processed with matching
- */
- scala> val Seq(one,two,_*) = ("""\d\d/""".r findAllIn "11/01/2010" ).toSeq
- one: String = 11/
- two: String = 01/
- scala> val Seq(one,two) = ("""\d\d/""".r findAllIn "11/01/2010" ).toSeq
- one: String = 11/
- two: String = 01/
- // drop the two first matches and assign the rest to d
- scala> val Seq(_,_,d @ _*) = ("""\d\d/""".r findAllIn "11/01/20/10/" ).toSeq
- d: Seq[String] = ArrayBuffer(20/, 10/)
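The `printDate` example above relies on the unmatched half of the alternation being null. An alternative, sketched here under the assumption that the two date formats can be tried one after the other, is to give each format its own Regex so that no group is ever null. The names `DateFormats`, `Slashed` and `Named` are hypothetical.

```scala
object DateFormats {
  // One extractor per format instead of a single alternation.
  val Slashed = """(\d\d)/(\d\d)/(\d{4})""".r
  val Named   = """(\w{3}) (\d\d),\s?(\d{4})""".r

  // Same result shape as printDate above: (day, month, year).
  def printDate(date: String): (String, String, String) = date match {
    case Slashed(day, month, year) => (day, month, year)
    case Named(month, day, year)   => (day, month, year)
    case _                         => ("x", "x", "x")
  }
}
```

Each case now has exactly three groups, so there is no need for the `_: String` type tests or the placeholder columns.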
What version of Scala are you using? The last set of examples doesn't work for me in the 2.7.5 REPL. I get the following for the very last example (assuming you meant "d : _*" instead of "d @ _*"):
error: ')' expected but identifier found.
And for the two examples prior to that, I get this error:
error: recursive value x$1 needs type
So I wrote it as
val Seq(one:String, two:String) = ...
But then I got this:
error: value toSeq is not a member of scala.util.matching.Regex.MatchIterator
So my first thought was that I might need 2.8 to make this work. Is that true?
I am using Scala 2.8. I will update the post.
What exactly do you mean by:
/*
If you want to use Regex's in assignment you must be sure the match will work. Otherwise you should do real matching
*/
I am trying to use this type of assignment matching with the RegEx object thinking that a non-match will just not match and continue, but Scala exits and from what I've read, you're not supposed to catch MatchErrors -- does this mean assignment with the RegEx object should only EVER be used if you're certain there will be a match?
I want to find out whether a string is a comment /*....*/. How can I match the * character?
@Tong: This is standard regex stuff; there is nothing special in Scala about escaping * and other regex characters. The standard escape is \ but you can use quoting as well, like:
\Qthis is a special quoted regex section *\E
Of course it is better to use triple quoted regexes so you don't have to escape your escapes like in Java
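As a rough illustration of the escaping options mentioned above, here are four spellings of a regex that matches the literal two-character comment opener. This is only a sketch; the object name `EscapingStar` and the helper `allFind` are made up for the example, while `Pattern.quote` is the standard Java method.

```scala
import java.util.regex.Pattern
import scala.util.matching.Regex

object EscapingStar {
  // Ordinary string literal: the backslash itself has to be escaped.
  val escaped: Regex = "/\\*".r
  // Triple-quoted literal: a single backslash is enough.
  val tripleQuoted: Regex = """/\*""".r
  // \Q ... \E quotes everything in between, so the star is literal.
  val quotedSection: Regex = """\Q/*\E""".r
  // Pattern.quote builds the same kind of quoted section for us.
  val viaPatternQuote: Regex = Pattern.quote("/*").r

  val all: Seq[Regex] = Seq(escaped, tripleQuoted, quotedSection, viaPatternQuote)

  // True only if every spelling finds the comment opener somewhere in s.
  def allFind(s: String): Boolean = all.forall(_.findFirstIn(s).isDefined)
}
```

All four behave the same way; the triple-quoted form is usually the easiest to read.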
@Eric: If you do something like:
val RegEx = "(h.*?)".r
val RegEx(hParam) = "what's up"
the previous line will throw an exception because "what's up" doesn't start with an 'h'. If you are not absolutely sure you should do:
val hParam = "what's up" match {
case RegEx(h) => h
case _ => "doesn't match h"
}
to handle the case where the regex doesn't apply.
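A variation on the same idea, sketched here with a hypothetical `SafeExtract` object, is to return an Option so the caller decides what to do when the regex does not apply, rather than encoding the failure as a String:

```scala
object SafeExtract {
  val RegEx = "(h.*?)".r

  // Some(group) when the whole input matches, None otherwise.
  // No MatchError can escape because of the catch-all case.
  def hParam(s: String): Option[String] = s match {
    case RegEx(h) => Some(h)
    case _        => None
  }
}
```

For example, `SafeExtract.hParam("what's up").getOrElse("doesn't match h")` gives the same fallback as the match expression above.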
I tried to do what Tong asked...
Here is my code: ("/\\*.\\*/").r
val s = "/**something*/"
s.r
and my error is Dangling meta character "*" near....
and I fixed my String like this:
val s = "/*\\*something*/"
s.r
It works, but I don't know how it works. Can you explain it to me, Jesse?
As I said before, there is nothing really Scala-related in this question. If you google regex in Java you will see how you need to do escapes.
However, in this case you are turning your string s into a regular expression; there is no matching going on. If you want to match /** something */ you might do something like:
scala> val regex = """/\*.*\*/""".r
regex: scala.util.matching.Regex = /\*.*\*/
scala> val Comment = """/\*\*(.*)\*/""".r
Comment: scala.util.matching.Regex = /\*\*(.*)\*/
scala> val Comment(c) = "/** hello */"
c: String = " hello "
Comment is a regex-based extractor. I used """ so I don't need to use a double \\ to escape each *. The parentheses are around .* because that matches anything, so the group will capture everything between /** and */.
I want to write something like this:
m = [0-9]*
n = \.[0-9]*
o = mn?
How can I do this in Scala?