Using vFunction to Analyze an Application without Deep Knowledge of the Code



Overview

When a team has the code of a running application but lacks deep knowledge of the code or of the application domain, vFunction can provide insights that help developers ramp up to support and enhance the application.

The topics below describe best practices.


Sufficient Learning Coverage

vFunction collects data on the application as it runs, using the vFunction agent attached to the application, in a process called “Learning”. For best results, ensure sufficient coverage of the flows in the application during Learning. It is best to install the vFunction agent on a production instance of the application and collect data until the number of functions and resources shown in the learning screen stops growing. If the agent cannot be used in production, run as many test scenarios (automated and manual) as possible, since coverage is harder to ensure for an unfamiliar application.
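One way to make "stops growing" concrete is to note the function and resource counts from the learning screen at regular intervals and check whether the latest readings have flattened out. The following is a minimal sketch, assuming the counts are recorded by hand; the function name, window size, and tolerance are illustrative choices, not vFunction settings.

```python
def has_plateaued(counts, window=3, tolerance=0.01):
    """Return True if growth over the last `window` readings is at most
    `tolerance` (as a fraction of the reading taken just before them)."""
    if len(counts) < window + 1:
        return False  # not enough readings to judge
    baseline = counts[-(window + 1)]
    latest = counts[-1]
    if baseline == 0:
        return latest == 0
    return (latest - baseline) / baseline <= tolerance


# Hypothetical function counts noted once a day from the learning screen.
function_counts = [1200, 2950, 4100, 4480, 4495, 4498, 4500]
print(has_plateaued(function_counts))  # True: under 1% growth over 3 readings
```

The same check can be run on the resource counts; Learning is a candidate for stopping only when both series have flattened.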


Dead Code Analysis

Determine whether the coverage of the application in Learning is sufficient by looking at the dead-code classes listed under the application domain. If any of these classes seem “familiar” or “important”, add specific tests that cover the flows using these classes during Learning.


Comprehensive Static Analysis

vFunction adds static analysis information to the dynamic analysis data gathered during Learning. The process that performs the static analysis is called “Viper” and is installed with the vFunction controller. Ensure that the Viper process is given the complete set of binaries, in the correct version.


Discrepancies between Static and Dynamic Analysis

To verify that all the binaries were provided to the Viper process, open the vFunction “Analysis” screen and check the notifications (bell icon). A notification is raised if there are discrepancies, such as classes that the agent detected during Learning but that do not appear in the analyzed binaries.
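When a discrepancy is reported, it can help to check independently which classes are missing from the binaries you supplied. The sketch below relies only on the fact that a jar is a zip archive; the list of observed class names is assumed to be copied by hand from the notification, and both function names are hypothetical. It handles plain jars only, so classes nested under a prefix directory (as in some repackaged jars) would need extra handling.

```python
import zipfile


def classes_in_jar(jar_path):
    """Return the fully qualified class names stored in a jar (a zip archive)."""
    names = set()
    with zipfile.ZipFile(jar_path) as jar:
        for entry in jar.namelist():
            if entry.endswith(".class"):
                # "com/acme/Order.class" -> "com.acme.Order"
                names.add(entry[: -len(".class")].replace("/", "."))
    return names


def missing_from_binaries(observed_classes, jar_paths):
    """Return observed class names that appear in none of the given jars."""
    known = set()
    for path in jar_paths:
        known |= classes_in_jar(path)
    return sorted(set(observed_classes) - known)
```

Calling `missing_from_binaries(observed, ["app.jar", "lib/orders.jar"])` with your own paths returns the classes that still need a binary; an empty list means every observed class was found.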


Scope Review

Determine the scope of the analysis:

  1. Review the namespaces or packages of the classes in the codebase that make up the application. Specify these namespaces in “Classes to Include” in the analysis parameters. Note: all sub-namespaces are included automatically, so there is no need to specify them (for example, no need to specify a.b.c. if a.b. is already specified).
  2. Review the jar graph (graph icon next to the Infra Jars field) and identify infra jars (third-party jars, or jars that you want to keep using as libraries as they are today). Any code that is not managed by the team working on the application should be considered an infra jar.
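Because sub-namespaces are included automatically, a long list of candidate namespaces can usually be reduced before it is entered in “Classes to Include”. The following is a minimal sketch of that reduction; the function name is hypothetical, and it compares whole package segments so that a.bc. is not treated as covered by a.b.

```python
def minimal_prefixes(namespaces):
    """Drop any namespace already covered by a shorter one in the list."""
    kept = []  # each entry is a list of package segments
    # Shorter names first, so a parent is always seen before its children.
    for ns in sorted(set(namespaces), key=lambda n: (len(n), n)):
        parts = ns.rstrip(".").split(".")
        covered = any(parts[: len(k)] == k for k in kept)
        if not covered:
            kept.append(parts)
    return [".".join(k) + "." for k in kept]


print(minimal_prefixes(["a.b.c.", "a.b.", "a.bc.", "x.y."]))
# ['a.b.', 'x.y.', 'a.bc.']
```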

Logical Domain Boundaries

Review the domains calculated by vFunction and look for a domain that “makes sense” (i.e. one that matches the business logic domain you expect to see). Review the flows covered by this domain and its entry points. See which domains share the non-exclusive classes and, based on that, refine the entry points and assign classes to the common library. To identify sources of new entry points, check whether the entry points are called from a filter, an interceptor, or the root.


Increasing the Domain Boundaries

Review the dynamic classes under the application domain and their flows. See if you can identify new entry points for new domains.


Eliminating Circular Dependencies

If you end up with circular domain dependencies (you can highlight them in orange), a complex domain topology, or domains with only 1-2 dynamic classes, consider merging domains (drag one domain on top of the other and confirm the merge).
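To reason about which domains are merge candidates, it can help to see circular dependencies as groups of domains that can all reach one another. The sketch below works on a dependency map written out by hand (domain name to the domains it calls); the domain names and function names are hypothetical, and this is not how vFunction itself computes the highlighting.

```python
def reachable(graph, start):
    """Return every domain reachable from `start` by following dependencies."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def circular_groups(graph):
    """Group domains that depend on each other, directly or transitively."""
    nodes = set(graph) | {d for deps in graph.values() for d in deps}
    reach = {n: reachable(graph, n) for n in nodes}
    groups, assigned = [], set()
    for n in sorted(nodes):
        if n in assigned:
            continue
        # Domains that n can reach and that can reach n back.
        group = {n} | {m for m in reach[n] if n in reach[m]}
        if len(group) > 1:
            groups.append(sorted(group))
        assigned |= group
    return groups


# Hypothetical domain dependencies.
deps = {
    "orders": ["billing"],
    "billing": ["inventory"],
    "inventory": ["orders"],
    "reports": ["orders"],
}
print(circular_groups(deps))  # [['billing', 'inventory', 'orders']]
```

Each returned group is a set of domains where merging at least two of them would shorten or remove the cycle; "reports" is left out because nothing depends on it in return.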


Breaking Down Large Domains

If one or two domains are much larger than the rest, review their entry points and consider splitting them into separate domains.
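"Big compared with the rest" is a judgment call. One simple heuristic, shown here purely as an illustration, is to flag domains whose class count exceeds a multiple of the median. The class counts, the function name, and the factor of 3 are assumptions to adjust for your application.

```python
from statistics import median


def oversized_domains(class_counts, factor=3.0):
    """Return domains whose class count exceeds `factor` times the median."""
    if not class_counts:
        return []
    typical = median(class_counts.values())
    return sorted(d for d, n in class_counts.items() if n > factor * typical)


# Hypothetical class counts per domain.
sizes = {"orders": 420, "billing": 60, "inventory": 45, "reports": 50, "auth": 30}
print(oversized_domains(sizes))  # ['orders']
```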


Automatically Detecting New Domains

If the classification percentage (in the analysis summary) is low and too many classes remain under the application domain, open the analysis parameters dialog (configure parameters) and reduce the minimum runtime parameter (say, from 0.1% to 0.01%). This lowers the threshold for the automatic analysis, so that it can detect new domains for flows that have lower runtime utilization.
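The effect of lowering the minimum runtime can be pictured as a simple filter over flows by their share of total runtime. The sketch below illustrates only that thresholding idea; the flow names and percentages are invented, and the actual automatic analysis applies further criteria that are not modeled here.

```python
def flows_above_threshold(runtime_share_pct, min_runtime_pct):
    """Return flows whose share of total runtime (in percent) meets the minimum."""
    return sorted(
        flow for flow, pct in runtime_share_pct.items() if pct >= min_runtime_pct
    )


# Hypothetical share of total runtime per flow, in percent.
shares = {"checkout": 4.2, "search": 1.3, "export": 0.05, "audit": 0.02, "legacy": 0.004}

print(flows_above_threshold(shares, 0.1))   # ['checkout', 'search']
print(flows_above_threshold(shares, 0.01))  # ['audit', 'checkout', 'export', 'search']
```

In this example, going from 0.1% to 0.01% lets two more low-traffic flows be considered, while the rarest one is still excluded.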


Identify Common Patterns

Identify common patterns used in the application. For example, are there classes with “Controller” in the name calling classes with “Service” in the name? Are there “Facade”, “DAO”, or “DTO” classes? To find out, use the global search and the visual call tree.
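If you can get a list of class names (for example from the jars or the source tree), a quick count of naming conventions shows which patterns are worth searching for. This is a minimal sketch working on a hand-made list; the keyword tuple and function name are assumptions, and each class is counted under the first keyword found in its simple name.

```python
from collections import Counter

ROLE_KEYWORDS = ("Controller", "Service", "Facade", "DAO", "DTO")


def count_roles(class_names, keywords=ROLE_KEYWORDS):
    """Count classes whose simple name contains one of the given keywords."""
    counts = Counter()
    for fqcn in class_names:
        simple = fqcn.rsplit(".", 1)[-1]  # strip the package part
        for keyword in keywords:
            if keyword in simple:
                counts[keyword] += 1
                break  # count each class once
    return counts


# Hypothetical class names.
names = [
    "com.acme.web.OrderController",
    "com.acme.svc.OrderService",
    "com.acme.svc.OrderServiceImpl",
    "com.acme.db.OrderDAO",
    "com.acme.util.Strings",
]
print(count_roles(names).most_common())
```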


Identify Common Entry Point Patterns

Try to identify patterns for plausible entry points and specify them in “Potential Entry Points” in the analysis parameters. For example, if you think that classes with names ending in Facade should hold the entry points, specify *Facade. If you think that methods named execute are entry points, specify *.execute(). This does NOT guarantee that all operations conforming to a pattern become entry points, since they must meet additional criteria, but it should influence the analysis and save significant work. Experiment with the patterns to see which domains result.
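Before entering a pattern, you can preview roughly what a wildcard would select from a list of class or method names. The sketch below uses Python's shell-style matching from the standard `fnmatch` module as a stand-in; vFunction's own matching rules may differ, and the names and the helper function are hypothetical.

```python
from fnmatch import fnmatchcase


def matching(names, patterns):
    """Return the names matched by at least one shell-style wildcard pattern."""
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


# Hypothetical class and operation names.
classes = ["com.acme.OrderFacade", "com.acme.OrderService"]
operations = [
    "com.acme.jobs.ReportJob.execute()",
    "com.acme.jobs.ReportJob.cancel()",
]

print(matching(classes, ["*Facade"]))         # ['com.acme.OrderFacade']
print(matching(operations, ["*.execute()"]))  # ['com.acme.jobs.ReportJob.execute()']
```

A pattern that selects far more names than expected in such a preview is probably too broad to be a useful entry point hint.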


Analyze High-Debt Classes

Review the list of high-debt classes to assess code complexity and to see some of the flows they take part in. Some of these classes may belong in the common library, and others may need to be flagged for refactoring.


Try Multiple Options

You can take snapshots of measurements (based on the same learning data) and prepare different options for the domains to be reviewed and discussed. Once the preferred option is selected, mark that measurement as the baseline measurement to get the TODO items reflecting the quality and fitness of the application.