Coding Period Week 4 & 5
Week 4 & 5 progress (June 17 - June 30)
These two weeks were crucial for the project as the mid-term evaluation approached, and we needed to cover most of the work. We held a special meeting in addition to our weekly status call, where my mentors and I discussed the project in a more detailed and thorough manner.
After receiving valuable feedback from my mentors on my PR, I began working on resolving the issues.
Weekly Status Call (17th June)
Discussion
Following the implementation of the post-scan plugin, this meeting focused on incorporating the license_clarity_score attribute into the codebase scan for each individual package.
I implemented this with changes on the package_plugin side.
- First, I moved the --package-summary post-scan plugin from summarycode/package_summary.py to packagedcode/plugin_package.py, where the main scan plugins such as package, system-package, and package-only are defined as normal scan options.
- Changes in setup.cfg:
    scancode_post_scan =
        package_summary = packagedcode.plugin_package:PackageSummary
        summary = summarycode.summarizer:ScanSummary
        tallies = summarycode.tallies:Tallies
        tallies-with-details = summarycode.tallies:TalliesWithDetails
- Changes in plugin_package.py:
    from commoncode.cliutils import SCAN_GROUP, POST_SCAN_GROUP
    from plugincode.post_scan import post_scan_impl
    from plugincode.post_scan import PostScanPlugin
    # --------- previous imports ---------

    @post_scan_impl
    class PackageSummary(PostScanPlugin):
        """
        Summary at the Package Level.
        """
        options = [
            PluggableCommandLineOption(('--package-summary',),
            is_flag=True, default=False,
            help='Generate Package Level summary',
            help_group=POST_SCAN_GROUP)
        ]

        def is_enabled(self, package_summary, **kwargs):
            return package_summary

        def process_codebase(self, codebase, package_summary, **kwargs):
            """
            Process the codebase.
            """
            if not self.is_enabled(package_summary):
                return
- I also renamed the attribute from license_clarity to license_clarity_score.
- I regenerated the tests in accordance with these changes so that the expected results stay consistent with the code and the test suite passes.
Check out the commits: https://github.com/nexB/scancode-toolkit/pull/3792/commits
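Each line under scancode_post_scan in setup.cfg is an entry point of the form name = module:attribute, which a plugin loader resolves to a class when the tool starts. The following is a minimal sketch of that resolution using only the standard library. The resolve_entry_point helper is hypothetical and not part of ScanCode, and collections:OrderedDict stands in for packagedcode.plugin_package:PackageSummary so that the snippet runs without ScanCode installed.

```python
import importlib


def resolve_entry_point(spec):
    """
    Resolve a ``name = module:attribute`` line to a (name, object) pair.
    Hypothetical helper for illustration; not part of ScanCode.
    """
    name, _, target = spec.partition("=")
    module_name, _, attr = target.strip().partition(":")
    module = importlib.import_module(module_name.strip())
    return name.strip(), getattr(module, attr.strip())


# Stand-in target: a stdlib class instead of PackageSummary.
name, obj = resolve_entry_point("ordered = collections:OrderedDict")
print(name, obj.__name__)
```

This is why moving the plugin class to another module only required updating the module path on the right-hand side of the package_summary entry.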
Attendees
- Philippe Ombredanne
- Ayan Sinha Mahapatra
- Jono Yang
- Omkar Phansopkar
- Ziad Hany
- Keshav Priyadarshi
- Pranay Das
- Michael Ehab
- Swastik Sharma
GSoC Sync Call (21st June)
Discussion
This was a special call arranged by my mentors to gain more clarity on the progress and how the project should meet its milestones.
Additionally, we discussed how the license_clarity_score field would be included in the Package instance following the use of the post-scan plugin.
The major implementation work was in PackageScanner (the class that scans a Resource for package data and reports it as “package_data” at the file level, then creates “packages” from this “package_data” at the top level).
- In the process_codebase method of this class, the package_summary flag had to be added as an input in order to check whether the package-summary plugin is enabled.
    def process_codebase(self, codebase, strip_root=False, package_only=False, package_summary=False, **kwargs):
        """
        Populate the ``codebase`` top level ``packages`` and ``dependencies``
        with package and dependency instances, assembling parsed package data
        from one or more datafiles as needed.

        Also perform additional package license detection that depends on either
        file license detection or the package detections.
        """
        # If we only want purls, we want to skip both the package
        # assembly and the extra package license detection steps
        if package_only:
            return

        has_licenses = hasattr(codebase.root, 'license_detections')

        # These steps add proper license detections to package_data and hence
        # this is performed before top level packages creation
        for resource in codebase.walk(topdown=False):
            # populate `from_file` attribute in matches
            for package_data in resource.package_data:
                for detection in package_data['license_detections']:
                    populate_matches_with_path(
                        matches=detection['matches'],
                        path=resource.path,
                    )

                for detection in package_data['other_license_detections']:
                    populate_matches_with_path(
                        matches=detection['matches'],
                        path=resource.path,
                    )
- The package_summary flag is passed to the create_package_and_deps function, which creates and saves top-level Package and Dependency instances from the parsed package data present in the codebase.
    # Create codebase-level packages and dependencies
    create_package_and_deps(codebase, package_summary, strip_root=strip_root, **kwargs)
    #raise Exception()
- The create_package_and_deps function was updated so that license_clarity_score is included in the package attributes when the plugin is used.
    def create_package_and_deps(codebase, package_summary=False, package_adder=add_to_package, strip_root=False, **kwargs):
        """
        Create and save top-level Package and Dependency from the parsed
        package data present in the codebase.
        """
        packages, dependencies = get_package_and_deps(
            codebase,
            package_adder=package_adder,
            strip_root=strip_root,
            **kwargs
        )

        codebase.attributes.packages.extend(
            package.to_dict(package_summary=package_summary)
            for package in packages
        )
        codebase.attributes.dependencies.extend(dep.to_dict() for dep in dependencies)
- To make sure the Package model has this attribute, models.py was updated.
  - With package_summary defaulting to False, the to_dict method returns the package details with license_clarity_score only when the plugin is used.
    def to_dict(self, package_summary=False):
        data = super().to_dict(with_details=False)
        if not package_summary:
            data.pop("license_clarity_score")
        return data
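The way the flag travels from the scanner down to to_dict can be tried in isolation. Below is a minimal sketch under stated assumptions: SketchPackage and serialize_packages are hypothetical stand-ins for the real Package model and for the serialization step in create_package_and_deps, and the score value is a made-up placeholder.

```python
from dataclasses import dataclass, asdict


@dataclass
class SketchPackage:
    """Hypothetical stand-in for the real Package model."""
    name: str
    license_clarity_score: dict

    def to_dict(self, package_summary=False):
        data = asdict(self)
        if not package_summary:
            # Drop the field unless the package summary was requested.
            data.pop("license_clarity_score", None)
        return data


def serialize_packages(packages, package_summary=False):
    # Thread the flag down to each package's to_dict().
    return [p.to_dict(package_summary=package_summary) for p in packages]


# Placeholder data for illustration only.
pkg = SketchPackage(name="change-case", license_clarity_score={"score": 100})
print(serialize_packages([pkg]))
print(serialize_packages([pkg], package_summary=True))
```

Keeping the default at False means existing callers that never pass the flag see the same output as before the change.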
Attendees
- [Ayan Sinha Mahapatra](https://github.com/AyanSinhaMahapatra)
- [Avishrant Sharma](https://github.com/AvishrantSsh)
- [Swastik Sharma](https://github.com/swastkk)
Weekly Status Call (24th June)
Discussion
The main point discussed in this meeting was bringing the major changes from the scancode-toolkit develop branch into my pull request.
Some of the tests were failing after those commits and needed to be regenerated. I regenerated the tests and reviewed the git diff to ensure that only the changes related to the PR and commit were included.
I also received a suggestion to store license_clarity_score as a dict rather than a list.
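To illustrate the difference between the two shapes, here is a small sketch. The keys and values are placeholders chosen for illustration and may not match the actual fields of license_clarity_score, and normalize_score is a hypothetical helper, not something from the PR.

```python
def normalize_score(value):
    """
    Return the clarity score as a dict.
    Hypothetical helper: accepts a dict or a single-item list of dicts.
    """
    if isinstance(value, list):
        if len(value) != 1:
            raise ValueError("expected exactly one score entry")
        return dict(value[0])
    return dict(value)


# Placeholder data; the keys and values are assumptions for illustration.
score_as_list = [{"score": 80, "declared_license": True}]
score_as_dict = {"score": 80, "declared_license": True}

# A list forces an extra index step; a dict is addressed directly by key.
print(score_as_list[0]["score"], score_as_dict["score"])
print(normalize_score(score_as_list) == score_as_dict)
```

Since each package has a single clarity score, a dict avoids the extra indexing step for anyone consuming the scan output.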
Also this week, a small test was added in test_plugin_package:
    # Just for test purposes, using the change-case example
    def test_plugin_package_with_package_summary(self):
        test_dir = self.get_test_loc('package_summary/change-case-change-case-5.4.4.zip-extract')
        result_file = self.get_temp_file('json')
        expected_file = self.get_test_loc('package_summary/expected.json')
        run_scan_click(['--package', '--strip-root', '--processes', '-2', '--package-summary', '--json-pp', result_file, test_dir])
        check_json_scan(expected_file, result_file, remove_uuid=True, remove_file_date=True, regen=REGEN_TEST_FIXTURES)
The directory structure is as follows:
    .
    └── change-case-change-case-5.4.4
        ├── LICENSE
        ├── README.md
        ├── SECURITY.md
        ├── package-lock.json
        ├── package.json
        ├── packages
        │   ├── change-case
        │   │   ├── README.md
        │   │   ├── package.json
        │   │   ├── src
        │   │   └── tsconfig.json
        │   ├── sponge-case
        │   │   ├── README.md
        │   │   ├── package.json
        │   │   ├── src
        │   │   └── tsconfig.json
        │   ├── swap-case
        │   │   ├── README.md
        │   │   ├── package.json
        │   │   ├── src
        │   │   └── tsconfig.json
        │   └── title-case
        │       ├── README.md
        │       ├── package.json
        │       ├── src
        │       └── tsconfig.json
        └── tsconfig.base.json
This type of example was chosen because it contains 4 packages, making it easy to run tests and providing more comprehensive information.
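The regen argument in the test above is what makes "regenerating tests" possible: when it is enabled, the expected file is rewritten from the current result before the comparison. The sketch below shows the general idea only. check_json is a hypothetical, simplified helper and does not reproduce the real check_json_scan, which also handles options such as remove_uuid and remove_file_date.

```python
import json
import os
import tempfile


def check_json(expected_file, result, regen=False):
    """
    Compare ``result`` with the JSON stored in ``expected_file``.
    When ``regen`` is true, overwrite the expected file with ``result`` first.
    Hypothetical helper; a simplified stand-in for a regen-style check.
    """
    if regen:
        with open(expected_file, "w") as f:
            json.dump(result, f, indent=2)
    with open(expected_file) as f:
        expected = json.load(f)
    assert result == expected, "scan result differs from expected fixture"


with tempfile.TemporaryDirectory() as tmp:
    expected_file = os.path.join(tmp, "expected.json")
    result = {"packages": [{"name": "change-case"}]}
    # The first call regenerates the fixture, the second only compares.
    check_json(expected_file, result, regen=True)
    check_json(expected_file, result)
    print("fixture matches")
```

Because regeneration overwrites the expected output unconditionally, reviewing the git diff afterwards, as described above, is the step that catches unintended changes.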