Tuesday, September 2, 2008

Finished RbYAML on GSoC

After about four months of work, I have almost reached my GSoC 2008 goals for rbyaml.
Many thanks to my mentors, Xueyong Zhi and Ola Bini. They helped me a lot with technical issues and with understanding what the community needs. This is my first open source project, and it has been a fantastic experience.
Now rbyaml supports more of the YAML 1.1 standard and is compatible with more of the YAML 1.0 standard than before. The default rbyaml interpreter targets YAML 1.1. You can also use the YAML 1.0 interpreter with require 'rbyaml_1.0'.
However, full compatibility with Syck is still a work in progress.

Although GSoC is over, I will keep working on rbyaml to improve and maintain it, and I hope to contribute to more open source projects. I think that is the most important goal of GSoC: getting more developers to contribute to open source projects, isn't it?

The future direction of rbyaml should probably be as follows.
1. Syck compatibility.
2. Support for the newer YAML standards, including 1.1 and 1.2 (still in draft).
3. Working well on Rubinius, Ruby 1.9, XRuby, etc.

Sunday, July 6, 2008

caught up with jvyamlb tests, working on syck tests now

The GSoC mid-term evaluation is coming.
Last month I worked on catching up with jvyamlb, and now almost all of the current jvyamlb tests pass on the rbyaml parser. I think this is a small milestone for rbyaml.
Thanks to Ola Bini: the jvyamlb codebase really helped me a lot while I was fixing rbyaml defects.

Over the last few days I have started working on Syck compatibility. In general, the issues are unimplemented methods and small defects when loading or dumping unusual input streams.
I've already added some missing methods such as parse and transform.

RbYAML still needs polishing. If you find any defects in rbyaml, please report them at http://code.google.com/p/rbyaml/issues/list. Thanks very much.

Friday, May 30, 2008

load document start marker & timestamp

I am still adding load_spec tests and fixing the bugs they turn up.
1. RbYAML should load "---" as a string, not as a document start marker. A document start marker is "---" followed by whitespace, e.g. "---\n", "---\t" or "--- " (with a space after the dashes).
RbYAML.load("---") => "---"


2. When loading a timestamp, the fraction of a second should be parsed correctly.
Previously, "2001-12-15T02:59:43.1Z" was loaded as a Time instance whose microsecond value was 1000.
In my opinion, it should be 1e5 microseconds (1 second equals 1e6 microseconds).
Also, if the fraction has more than 6 digits, it should be rounded to the nearest microsecond. For example,
for "2001-12-15T02:59:43.1234567890000Z", the microsecond value should be 123457.
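
The rule in item 2 can be sketched as a small helper. This is only an illustration of the arithmetic; fraction_to_usec is a hypothetical name and not part of RbYAML.

```ruby
# Hypothetical helper (not RbYAML's actual code): convert the digits after
# the decimal point of a timestamp into microseconds, rounding when more
# than six digits are given.
def fraction_to_usec(digits)
  # "1" means 0.1 second, so the digits are divided by 10**length
  # and then scaled to microseconds (1 second == 1_000_000 microseconds).
  (Rational(digits.to_i, 10**digits.length) * 1_000_000).round
end

fraction_to_usec("1")              # => 100000
fraction_to_usec("1234567890000")  # => 123457
```

Note that a fraction such as "9999999" rounds up to 1000000, so a real parser would also have to carry that into the seconds field.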

The following is about CRuby.
1. In Syck (CRuby YAML), the fraction is loaded as 1234567890000 microseconds and then carried over into the corresponding seconds, minutes, hours...
Here the result is Sat Dec 15 03:35:30 UTC 2001, and the microsecond value is 483647.
2. The Time class. For a Time instance, the to_s and inspect methods are only accurate to the second. You have to call the usec method to get the microseconds. This can be a bit confusing when you compare two Time instances.
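
Both points can be checked with plain Time arithmetic. The first part shows two Time instances that print identically with to_s but are not equal. The second part is only my reading of the numbers above, not something confirmed from the Syck source: the result is what you get if the microsecond count is capped at 2**31 - 1.

```ruby
# Two instants that differ only in their microseconds.
a = Time.utc(2001, 12, 15, 2, 59, 43, 100_000)
b = Time.utc(2001, 12, 15, 2, 59, 43, 1_000)

same_text  = (a.to_s == b.to_s)  # true: to_s stops at the second
same_value = (a == b)            # false: the usec parts differ
usecs      = [a.usec, b.usec]    # [100000, 1000]

# Assumption: the microsecond count saturates at the largest 32-bit integer.
INT32_MAX = 2**31 - 1
extra_secs, left_usec = INT32_MAX.divmod(1_000_000)  # [2147, 483647]
shifted = Time.utc(2001, 12, 15, 2, 59, 43) + extra_secs
clock = [shifted.hour, shifted.min, shifted.sec]     # [3, 35, 30]
```

So 2147 extra seconds move 02:59:43 to 03:35:30, with 483647 microseconds left over, which matches the values Syck returned.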

Saturday, May 24, 2008

change about RbYAML#load

These days I have been adding more spec test cases; most of them come from jvyamlb.

What has changed?
1. RbYAML can now load an empty instance, meaning a node that has no value. Previously, rbyaml threw an exception when loading an instance without a value.
For example,
RbYAML.load("!!str") => ""
RbYAML.load("!!null") => nil
#nil is an instance of NilClass

2. RbYAML returns nil when loading an empty stream.
RbYAML.load("") => nil


Another thing is that I am going to support both YAML 1.0 and YAML 1.1, because there are differences between them.
You can use "require 'rbyaml_1.0'" to get a YAML 1.0 parser.
Note, however, that YAML 1.0 parsing is still a work in progress.

Tuesday, May 20, 2008

conflict between rbyaml and syck

(This conflict will not happen if you use rbyaml instead of syck in your ruby compiler/interpreter.)

Most of the time, you don't want to load the yaml lib and the rbyaml lib at the same time,
but sometimes the yaml lib is required implicitly when loading other common libraries,
because many libraries load yaml in their implementations.
Here is an example.

What will reproduce the problem?
require 'rubygems' # or any file that requires yaml implicitly
require 'rbyaml'
:symbol.to_yaml # call the to_yaml method on any base type object

What is the expected result?
to_yaml as defined in the RbYAML library is called
What do you see instead?
to_yaml as defined in the Syck library is called

Most of the time the methods from Syck and RbYAML have the same behaviour and output.
But if there is a defect in Syck, it will affect rbyaml's behaviour.
In this example, RbYAML only defines to_yaml in the Object class, so when you call Symbol#to_yaml, it calls the superclass method, Object#to_yaml.
But Syck defines to_yaml in Object, Symbol and many other base types. So if you load rbyaml after syck, rbyaml will overwrite the to_yaml method in Object, but leave the to_yaml methods in other base types such as Symbol, Hash, etc. untouched.
After that, if you call Symbol#to_yaml, it does not call the superclass method defined by RbYAML, but the Symbol#to_yaml defined in Syck. That is not what we want.
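
The method lookup described above can be reproduced without either library. In this sketch, to_yaml_demo is a made-up method name standing in for to_yaml, so that the real method is left alone; the returned strings only label which "library" answered.

```ruby
# Stand-in for Syck, loaded first: defines the method on Object and Symbol.
class Object
  def to_yaml_demo
    "syck Object"
  end
end

class Symbol
  def to_yaml_demo
    "syck Symbol"
  end
end

# Stand-in for RbYAML, loaded afterwards: redefines only Object's method.
class Object
  def to_yaml_demo
    "rbyaml Object"
  end
end

from_symbol  = :symbol.to_yaml_demo  # "syck Symbol": Symbol's own method wins
from_integer = 42.to_yaml_demo       # "rbyaml Object": falls back to Object
```

Ruby looks for the method in Symbol before Object, so the older Symbol definition keeps answering unless the later library also redefines or removes it.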

The to_yaml method conflict has already been fixed, but I think some other conflicts may still exist. I've added an issue about this on Google Code.

Monday, May 19, 2008

porting of spec tests

Today I ported all the YAML spec tests from rubinius to rbyaml.
To make all of them pass with the rbyaml library, I fixed three small bugs.
1. RbYAML.tagged_classes, given the int tag URI, should return Integer.
RbYAML.tagged_classes["tag:yaml.org,2002:int"].should == Integer

2. Symbol#to_yaml should return the symbol's YAML representation.
:symbol.to_yaml.should == "--- :symbol\n"

3. RbYAML should have a tagurize method.
RbYAML.tagurize('wtf').should == "tag:yaml.org,2002:wtf"
RbYAML.tagurize(1).should == 1
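
A minimal sketch of a method that satisfies the two tagurize specs above. It is not RbYAML's actual implementation: tagurize_sketch and YAML_TAG_PREFIX are hypothetical names, and leaving strings that already start with "tag:" untouched is my own assumption.

```ruby
# Hypothetical sketch; the names here are not taken from RbYAML.
YAML_TAG_PREFIX = "tag:yaml.org,2002:"

def tagurize_sketch(value)
  # Non-strings are returned as they are, as the second spec requires.
  return value unless value.is_a?(String)
  # Assumption: a string that is already a full tag is left unchanged.
  value.start_with?("tag:") ? value : YAML_TAG_PREFIX + value
end

tagurize_sketch('wtf')  # => "tag:yaml.org,2002:wtf"
tagurize_sketch(1)      # => 1
```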

Rubinius and Syck really helped me a lot while improving rbyaml.
Thanks to them.

Sunday, May 11, 2008

symbol parsing bug has been fixed

I believe rbyaml may have been ported from pyyaml, and Python has no symbol class. So rbyaml did not support symbol parsing at first.

This week I worked on the symbol parsing bug. I spent a lot of time tracing the stream parsing flow with ruby-debug. While tracing, I added some spec tests to cover the method call stack.
After I had worked out a fix, I found that the bug had already been fixed in rubinius. That really helped me a lot.
Now RbYAML can parse streams that include symbols.
You can try it with the following code.
RbYAML.load("---\n :firstname: Long\n lastname: Sun")
=> {:firstname=>"Long", "lastname"=>"Sun"}
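
The idea behind the fix can be sketched as one extra resolution step for plain scalars. This is only an illustration and not the code in RbYAML or rubinius; resolve_plain_scalar is a hypothetical name.

```ruby
# Hypothetical resolver step: a plain scalar that looks like ":name"
# becomes a Symbol, anything else stays a String.
def resolve_plain_scalar(text)
  text =~ /\A:\S+\z/ ? text[1..-1].to_sym : text
end

# The key/value scalars of the mapping in the example above.
pairs  = [[":firstname", "Long"], ["lastname", "Sun"]]
loaded = pairs.map { |k, v| [resolve_plain_scalar(k), resolve_plain_scalar(v)] }.to_h
# loaded == {:firstname=>"Long", "lastname"=>"Sun"}
```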

Wednesday, May 7, 2008

syck bug

These days I am working on symbol parsing.
I found a bug in syck.
According to the Ruby documentation,
  YAML.parse( "--- :locked" )
#=> # @type_id="tag:ruby.yaml.org,2002:sym",
@value=":locked", @kind=:scalar>
but in fact,
irb(main):002:0> YAML.parse("--- :symbol").type_id
=> "tag:yaml.org,2002:str"

Monday, May 5, 2008

load symbol as string

Currently rbyaml loads symbols as strings, which I think is a defect.
For example,
>RbYAML.load("--- :sym")
=> ":sym" # It's a string
>YAML.load("--- :sym")
=> :sym # It's a symbol


I am starting to work on that defect.
You can see more tasks and defects at the following URL,
http://code.google.com/p/rbyaml/issues/list

spec test for rbyaml

These days I have taken a look at rbyaml, the yaml interface and jvyamlb, and now have an outline of how a YAML processor works.
Compared to rbyaml and MRI, the spec tests in rubinius are more detailed: there are tests for each method.
Previously I was unsure how to organize spec tests, but now I think rubinius organizes its spec tests very well.
So I ported the yaml and rbyaml spec tests from rubinius to rbyaml.
Now I am modifying the yaml spec tests to turn them into rbyaml spec tests.
While porting, I found some bugs in the current rbyaml.
I am going to fix them and add more specs as I go.

Tuesday, April 29, 2008

Compared to the MRI YAML interface, what rbyaml is missing

After comparing rbyaml.rb to yaml.rb from MRI, I've summarized the missing methods.
Mainly, RbYAML now needs some class variables such as parser, emitter and resolver (named loader in RbYAML).

More information follows.
  • rbyaml.rb[main interface]
    • missing method
      • parser
        • using resolver method
      • generic_parser
        • using GenericResolver
      • resolver
        • return DefaultResolver
      • emitter
        • using resolver method
      • parse
        • generic_parser.load
      • parse_file
        • parse
      • each_node
        • parser.load_documents
      • parse_documents
        • each_node
      • add_*_type
        • resolver.add_type(loader in rbyaml)
      • tagurize
        • resolver.tagurize
      • transfer
        • resolver.transfer
      • try_implicit
        • transfer & detect_implicit
      • quick_emit
        • using emitter; there is quick_emit_node in RbYAML, but they are not the same
      • Kernel.y
        • like p, y is a really convenient way to YAML.dump
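
To show what the last item means, here is a minimal sketch of a y-like helper built on Ruby's standard yaml library. It is not MRI's or RbYAML's implementation, and y_sketch is a hypothetical name chosen so that no existing y method is replaced.

```ruby
require 'yaml'

module Kernel
  # Hypothetical stand-in for Kernel#y: print each object as YAML,
  # much as p prints the inspect output.
  def y_sketch(*objects)
    objects.each { |object| puts YAML.dump(object) }
    nil
  end
end

y_sketch({ "name" => "rbyaml" })  # prints the hash as a YAML document
```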

Monday, April 28, 2008

Found a comrade

There is a rubyspec project in GSoC 2008 by Federico Builes. It will support both rubinius and MRI.
(Specs for Ruby Standard Libraries)
The best part is that it will also include YAML spec tests.
Maybe we can help each other with the YAML specs.
:)

Fisheye for RbYAML

FishEye is a source repository browsing system that allows you to monitor, search and analyse changes in your codebase.

Now fisheye for rbyaml is available.
(http://fisheye2.cenqua.com/browse/rbyaml/)

after trying syck, going to try jvyamlb

These days I have started porting the MRI YAML tests to RbYAML.
I used to think there would be only a few differences between them. But in the end I found it is really hard to make the tests pass with simple modifications.
RbYAML currently has no parser interface like MRI's; the two were designed with different structures.
After talking with my mentors, I was advised to have a look at jvyamlb, the YAML lib in JRuby, because it is more similar to RbYAML.

After these few days of experimenting, I think I should first get a deeper understanding of the structure of MRI YAML (syck) and rbyaml.

Thursday, April 24, 2008

First code for rbyaml

An update on the current status:

1. Mailing list available.
2. Added a tiny Rakefile.
3. Ported 45 yaml test cases from CRuby to RbYAML. Test run result:
57 tests, 57 assertions, 27 failures, 27 errors.
4. Created some issues/tasks on Google Code:
1) Clean up warnings.
2) Syck compatibility.
3) Make the original tests pass.
5. FishEye for rbyaml is on the way.

Tuesday, April 22, 2008

Start my GSoC

This may be my last year on campus, and I am very glad that my GSoC proposal has been accepted.
Xue Yong Zhi (XRuby organizer) and Ola Bini (RbYAML organizer) will act as my mentors.
This is a really great chance for me to get involved in the ruby community.
My project is to improve RbYAML, a YAML processor implemented in pure Ruby. Being implemented in Ruby means it can be used directly by most of the alternative Ruby implementations.
Currently, I think the most important work is to make RbYAML compatible. The further plan is to improve performance.

RbYAML is on Google Code now. (http://code.google.com/p/rbyaml/)
The original CVS history from RubyForge has been stored as text in the new code repository by Ola.

Wishing for a great experience this summer.