Wednesday, 26 October 2011

Notes for Ruby 1.9: Files, Serialization, Regular Expressions

Files in Ruby

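  The main file-related classes and modules in Ruby are IO, File, Dir, and FileUtils. File inherits from IO, and FileUtils is a module loaded with require 'fileutils'.
IO: foreach, readlines
File:
  • File operations: new, open, puts, close, eof, ...
  • File modes: r, r+, w, w+, a, a+, b
  • Check whether a file or directory exists: File.exist?
  • Check whether a path is a directory: File.directory?
  When File.open is given a block, the file is closed automatically when the block ends: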
File.open('test.txt') { |f|
     p f.gets
}

Dir: foreach
FileUtils (requires require 'fileutils'): cp, mv, rm, mkdir
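  The methods listed above can be combined as in the following minimal sketch. The file and directory names (test.txt, backup, copy.txt) are made up for illustration, and everything happens in a temporary directory created by Dir.mktmpdir so nothing is left behind.

```ruby
require 'fileutils'
require 'tmpdir'

lines = nil
names = nil
copied = nil

# Dir.mktmpdir with a block removes the directory when the block ends.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'test.txt')   # hypothetical file name

  # 'w' creates or truncates the file; the block form closes it automatically.
  File.open(path, 'w') { |f| f.puts 'first', 'second' }

  # 'a' appends to the end of the existing file.
  File.open(path, 'a') { |f| f.puts 'third' }

  # readlines and foreach are defined on IO and inherited by File.
  lines = File.readlines(path).map(&:chomp)
  File.foreach(path) { |line| puts line }

  # FileUtils: create a directory and copy the file into it.
  backup = File.join(dir, 'backup')
  FileUtils.mkdir(backup)
  FileUtils.cp(path, File.join(backup, 'copy.txt'))
  copied = File.directory?(backup) && File.exist?(File.join(backup, 'copy.txt'))

  # Dir.foreach also yields '.' and '..', so skip dot entries.
  names = []
  Dir.foreach(dir) { |name| names << name unless name.start_with?('.') }
  names.sort!
end
```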

Serialization in Ruby

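  Ruby offers two serialization mechanisms, YAML and Marshal. Marshal is built in and needs no require.
YAML: requires require 'yaml'
  Any object can be serialized with its to_yaml method. In the output, --- marks the start of a new YAML document and - marks one element of a sequence.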
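puts ["a", "b"].to_yaml
y(["a", "b"])    # prints the same YAML; depending on the Ruby version, y may only be defined inside irb
# Each call prints:
# ---
# - a
# - b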

  YAML.dump serializes an object to a YAML string, and YAML.load parses one back into an object.
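  A minimal round trip through YAML.dump and YAML.load might look like this. The hash contents are made up for illustration.

```ruby
require 'yaml'

# Hypothetical sample data built only from plain hashes, strings, and arrays.
data = { 'name' => 'baron', 'scores' => [1, 3] }

text = YAML.dump(data)       # serialize to a YAML string
restored = YAML.load(text)   # parse the string back into Ruby objects

puts text
p restored == data
```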
  to_yaml_properties: returns the names of the instance variables to serialize; all others are ignored.
class Yclass
    def initialize(aNum, aStr, anArray)
        @num = aNum
        @str = aStr
        @arr = anArray
    end
    
    def to_yaml_properties
        ["@num", "@arr"]
    end
end

  This way only @num and @arr are serialized.
  YAML.load_documents loads multiple YAML documents from a single stream.
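  As a sketch of reading several documents from one stream: the example below uses YAML.load_stream, which is the method the Psych-based YAML library provides for this. Treat the substitution as an assumption; on Ruby 1.9 with the older engine, load_documents plays the same role.

```ruby
require 'yaml'

# Two documents in one string; each YAML.dump output starts with '---'.
stream = YAML.dump(['a', 'b']) + YAML.dump('n' => 1)

# load_stream returns one Ruby object per document, in order.
docs = YAML.load_stream(stream)

docs.each { |doc| p doc }
```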
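Marshal: also provides dump and load, but some objects cannot be serialized. Dumping a Binding, Proc, Method, or IO instance, or a singleton object, raises a TypeError. A class can define marshal_dump to decide which attributes are serialized, and marshal_load, which is called during load with the value that marshal_dump returned: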
class Mclass
    def initialize(aNum, aStr, anArray)
        @num = aNum
        @str = aStr
        @arr = anArray
    end
    
    def marshal_dump
        [@num, @arr]
    end
    
    def marshal_load(data)
        @num = data[0]
        @arr = data[1]
        @str = "default"
    end
end

ob = Mclass.new(100, "baron", [1, 3])
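  To see the effect of marshal_dump and marshal_load, the sketch below repeats a compact version of Mclass, with attr_reader added purely for inspection (an addition, not part of the original class), and then round-trips an instance. It also shows the TypeError raised when dumping a Proc.

```ruby
class Mclass
  attr_reader :num, :str, :arr   # added here only so the result can be inspected

  def initialize(aNum, aStr, anArray)
    @num = aNum
    @str = aStr
    @arr = anArray
  end

  # Only @num and @arr are written out.
  def marshal_dump
    [@num, @arr]
  end

  # Called by Marshal.load with the array returned by marshal_dump.
  def marshal_load(data)
    @num = data[0]
    @arr = data[1]
    @str = "default"
  end
end

ob = Mclass.new(100, "baron", [1, 3])
copy = Marshal.load(Marshal.dump(ob))
p copy.num, copy.arr, copy.str   # @str was not saved, so it comes back as "default"

# A Proc cannot be marshaled.
error = begin
  Marshal.dump(proc { 1 })
  nil
rescue TypeError => e
  e
end
puts error.class
```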

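  Serializing singleton objects: YAML can serialize an object that has singleton methods, but the loaded copy no longer has them. After loading, you can turn the object back into a singleton, for example: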
def makeIntoSingleton(someOb)
    class << someOb
        ...
    end
    return someOb
end
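  A concrete version of this idea, as a sketch only: the helper name make_into_singleton and the shout method are hypothetical stand-ins for whatever the elided body above defines, and a plain array is used as the object so that YAML.load accepts it without extra options.

```ruby
require 'yaml'

# Hypothetical helper: (re-)attaches a singleton method to the given object.
def make_into_singleton(some_ob)
  class << some_ob
    def shout
      map(&:upcase)
    end
  end
  some_ob
end

original = make_into_singleton(['a', 'b'])

# The YAML text only records the array contents, not the singleton method.
restored = YAML.load(YAML.dump(original))
lost = !restored.respond_to?(:shout)

# Re-attach the singleton method after loading.
make_into_singleton(restored)
p restored.shout
```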

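  Marshal version compatibility: marshaled data carries a major and a minor version number, stored in its first and second bytes. These numbers are independent of the Ruby version. Data can be loaded only when its major version matches and its minor version is not higher than the loader's. Marshal exposes both numbers as the constants Marshal::MAJOR_VERSION and Marshal::MINOR_VERSION.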
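  The version bytes can be inspected directly. This short sketch dumps an arbitrary string and compares the first two bytes with the two constants.

```ruby
dumped = Marshal.dump('x')

# The first byte is the major version, the second the minor version.
major, minor = dumped.bytes.first(2)

puts "data:    #{major}.#{minor}"
puts "runtime: #{Marshal::MAJOR_VERSION}.#{Marshal::MINOR_VERSION}"
```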

Regular Expressions in Ruby

  In Ruby, =~ tests a string against a regular expression. It returns the index where the match starts, or nil if there is no match:
puts(/abc/ =~ 'abc')    # prints 0
  There are several ways to create a regular expression:
  • Regexp.new('^[a-z]*$')
  • /^[a-z]*$/
  • %r{^[a-z]*$}
  Match groups:
/^(\s*)#(.*)/ =~ ' #comment'
puts $1 << "//" << $2
  sub:
' #comment'.sub(/^(\s*)#(.*)/, '\1//\2')
  match: unlike =~, it returns a MatchData object (or nil if there is no match):
x = /(^.*)(#)(.*)/.match('def myMethod # This is a method')
x.captures.each{ |item| puts(item) }
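  Both to_a and captures return an array; the array from to_a additionally starts with the whole matched text.
  A MatchData object can also be indexed directly. Index 0 is the whole matched text and the groups start at index 1: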
puts(/(.)(.)(.)/.match("abc")[2])
  pre_match (or $`) and post_match (or $') return the text before and after the match:
x = /#/.match('def myMethod # This is a method')
puts x.pre_match
puts $`
puts x.post_match
puts $'
  $~ holds the most recent MatchData object.
  * and + are greedy; appending ? makes them match as little as possible.
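  The difference between greedy and non-greedy matching shows up as soon as the closing delimiter occurs more than once. A small sketch with a made-up input string:

```ruby
text = '<b>one</b> <b>two</b>'   # hypothetical sample input

# Greedy: .* runs as far as it can, so the match ends at the LAST </b>.
greedy = text[/<b>.*<\/b>/]

# Non-greedy: .*? stops at the FIRST </b> that lets the pattern succeed.
lazy = text[/<b>.*?<\/b>/]

puts greedy   # <b>one</b> <b>two</b>
puts lazy     # <b>one</b>
```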
  String#scan finds every match and also accepts a block. Many other String methods take a regular expression, for example slice, slice!, split, sub!, gsub, and gsub!.
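  A few of these methods applied to the sample line used earlier, as an illustrative sketch:

```ruby
line = 'def myMethod # This is a method'

# scan without a block returns all matches as an array.
words = line.scan(/\w+/)

# scan with a block yields each match in turn.
lengths = []
line.scan(/\w+/) { |w| lengths << w.length }

# gsub replaces every match; sub would replace only the first.
replaced = line.gsub(/\s+/, '_')

# split on the comment marker, at most two pieces.
code, comment = line.split(/\s*#\s*/, 2)

# slice returns the first matching substring.
sliced = line.slice(/my\w+/)

p words, lengths, replaced, code, comment, sliced
```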
  Regular expression summary:
^       beginning of a line or string
$       end of a line or string
.       any character except newline
*       zero or more repetitions of the preceding expression
*?      zero or more repetitions of the preceding expression (non-greedy)
+       one or more repetitions of the preceding expression
+?      one or more repetitions of the preceding expression (non-greedy)
[]      range specification (e.g. [a-z] means a character in the range 'a' to 'z')
\w      a word character (letter, digit, or underscore)
\W      a non-word character
\s      a whitespace character
\S      a non-whitespace character
\d      a digit
\D      a non-digit character
\b      a backspace (when in a range specification)
\b      word boundary (when not in a range specification)
\B      non-word boundary
{m,n}   at least m and at most n repetitions of the preceding expression
?       at most one repetition of the preceding expression
|       either the preceding or the next expression may match
()      a group
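  Several entries from the summary in action. The input strings are made up for illustration.

```ruby
# {m,n}: runs of two to three digits.
digits = 'a1 b22 c333'.scan(/\d{2,3}/)

# \b: "cat" only as a whole word.
whole_words = 'cat concat cats'.scan(/\bcat\b/)

# |: either alternative may match.
colors = 'red green blue'.scan(/red|blue/)

# ?: the preceding character is optional.
spellings = 'color colour'.scan(/colou?r/)

# ^ and $ match at line boundaries, not only at the ends of the string.
line_start = ("a\nb" =~ /^b$/)

p digits, whole_words, colors, spellings, line_start
```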
