Coding Period Week 4 & 5
Week 4 & 5 progess (June 17 - June 30)
These two weeks were crucial for the project as the mid-term evaluation approached, and we needed to cover most of the work. We held a special meeting in addition to our weekly status call, where my mentors and I discussed the project in a more detailed and thorough manner.
After receiving valuable feedback from my mentors on my PR, I began working on resolving the issues.
Weekly Status Call (17th June)
Discussion
Following the implementation of the Post Scan Plugin, the agenda of this meeting focused on incorporating the license_clarity_score attribute into the codebase scan for each individual package.
Implemented this with changes in the package_plugin side.
- Firstly moved the
--package-summarypost scan plugin fromsummarycode/package_summary.pyin thepackagedcode/package_plugin.pywhere the main scan_plugins likepackage,system-package,package-onlywere there as normal scan options.- Changes in
setup.cfg1 2 3 4 5
scancode_post_scan = package_summary = packagedcode.plugin_package:PackageSummary summary = summarycode.summarizer:ScanSummary tallies = summarycode.tallies:Tallies tallies-with-details = summarycode.tallies:TalliesWithDetails
- Changes in
plugin_package.py1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
from commoncode.cliutils import SCAN_GROUP, POST_SCAN_GROUP from plugincode.post_scan import post_scan_impl from plugincode.post_scan import PostScanPlugin ---------Previous-imports------------------------------------------ @post_scan_impl class PackageSummary(PostScanPlugin): """ Summary at the Package Level. """ options = [ PluggableCommandLineOption(('--package-summary',), is_flag=True, default=False, help='Generate Package Level summary', help_group=POST_SCAN_GROUP) ] def is_enabled(self, package_summary, **kwargs): return package_summary def process_codebase(self, codebase, package_summary, **kwargs): """ Process the codebase. """ if not self.is_enabled(package_summary): return
- Changes in
- Also renaming the attribute name from
license_claritytolicense_clarity_score. - Regenerating tests in accordance with the changes ensures the code remains consistent and everything runs smoothly.
Checkout the commits- https://github.com/nexB/scancode-toolkit/pull/3792/commits
Attendees
- Philippe Ombredanne
- Ayan Sinha Mahapatra
- Jono Yang
- Omkar Phansopkar
- Ziad Hany
- Keshav Priyadarshi
- Pranay Das
- Michael Ehab
- Swastik Sharma
GSoC Sync Call (21st June)
Discussion
This was a special call arranged by my mentors to gain more clarity on the progress and how the project should meet its milestones.
Additionally, we discussed how the license_clarity_score field would be included in the Package instance following the use of the postscan plugin.
Major implentation in the
PackageScanner(Class that Scans a Resource for Package data and report these as “package_data” at the file level. Then create “packages” from these “package_data” at the top level.)- In the process codebase method of this class, there was need to include the package_summary flag as input so to have check if
package-summaryplugin is enabled or not.1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
def process_codebase(self, codebase, strip_root=False, package_only=False, package_summary=False, **kwargs): """ Populate the ``codebase`` top level ``packages`` and ``dependencies`` with package and dependency instances, assembling parsed package data from one or more datafiles as needed. Also perform additional package license detection that depends on either file license detection or the package detections. """ # If we only want purls, we want to skip both the package # assembly and the extra package license detection steps if package_only: return has_licenses = hasattr(codebase.root, 'license_detections') # These steps add proper license detections to package_data and hence # this is performed before top level packages creation for resource in codebase.walk(topdown=False): # populate `from_file` attribute in matches for package_data in resource.package_data: for detection in package_data['license_detections']: populate_matches_with_path( matches=detection['matches'], path=resource.path, ) for detection in package_data['other_license_detections']: populate_matches_with_path( matches=detection['matches'], path=resource.path, )
- In the process codebase method of this class, there was need to include the package_summary flag as input so to have check if
- Passing the
package_summaryflag to thecreate_package_and_depsfunc that creates and save top-level Package and Dependency from the parsed package data present in the codebase.1 2 3
# Create codebase-level packages and dependencies create_package_and_deps(codebase, package_summary, strip_root=strip_root, **kwargs) #raise Exception()
- Updates in the
create_package_and_depsfunc to have thelicense_clarity_scorein the package attribute when plugin was used.1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
def create_package_and_deps(codebase, package_summary= False ,package_adder=add_to_package, strip_root=False, **kwargs): """ Create and save top-level Package and Dependency from the parsed package data present in the codebase. """ packages, dependencies = get_package_and_deps( codebase, package_adder=package_adder, strip_root=strip_root, **kwargs ) codebase.attributes.packages.extend( package.to_dict(package_summary= package_summary) for package in packages ) codebase.attributes.dependencies.extend(dep.to_dict() for dep in dependencies)
- To make sure the Package Model had this attribute, updates in
models.py- With the default value as False for the
package_summary, theto_dictmethod returns the package details withlicense_clarity_scorewhen only the plugin was used.1 2 3 4 5
def to_dict(self, package_summary= False): data = super().to_dict(with_details=False) if not package_summary: data.pop("license_clarity_score") return data
Attendees
1 2 3
- [Ayan Sinha Mahapatra](https://github.com/AyanSinhaMahapatra) - [Avishrant Sharma](https://github.com/AvishrantSsh) - [Swastik Sharma](https://github.com/swastkk)
- With the default value as False for the
Weekly Status Call (24th June)
Discussion
Major point that was dicussed in this meet, was to include the major changes that was there in scancode-toolkit develop branch to my Pull Request
Some of the tests were failing after those commits and needed to be regenerated. I regenerated the tests and reviewed the git diff to ensure that only the changes related to the PR and commit were included.
Also recived a suggestion to have the license_clarity_score to be stored as dict and not list
Also this week, we had a small test addition in the test_plugin_package-
1
2
3
4
5
6
7
8
# just for test purpose, used change-case example
def test_plugin_package_with_package_summary(self):
test_dir = self.get_test_loc('package_summary/change-case-change-case-5.4.4.zip-extract')
result_file = self.get_temp_file('json')
expected_file = self.get_test_loc('package_summary/expected.json')
run_scan_click(['--package', '--strip-root', '--processes', '-2','--package-summary','--json-pp', result_file, test_dir])
check_json_scan(expected_file, result_file, remove_uuid=True, remove_file_date=True, regen=REGEN_TEST_FIXTURES)
With the dir structure as follows-
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
.
└── change-case-change-case-5.4.4
├── LICENSE
├── README.md
├── SECURITY.md
├── package-lock.json
├── package.json
├── packages
│ ├── change-case
│ │ ├── README.md
│ │ ├── package.json
│ │ ├── src
│ │ └── tsconfig.json
│ ├── sponge-case
│ │ ├── README.md
│ │ ├── package.json
│ │ ├── src
│ │ └── tsconfig.json
│ ├── swap-case
│ │ ├── README.md
│ │ ├── package.json
│ │ ├── src
│ │ └── tsconfig.json
│ └── title-case
│ ├── README.md
│ ├── package.json
│ ├── src
│ └── tsconfig.json
└── tsconfig.base.json
This type of example was chosen because it contains 4 packages, making it easy to run tests and providing more comprehensive information.



